This page contains interactive figures associated with the MSLC24 paper: the full Matrix, System Averages, and Empty Input or Empty Hypothesis plots. Using the dropdown menus and other interactive elements, you can select the language pair or other features of the plots.
Matrix of segment-level scores for the selected language pair. Along the diagonal are stacked histograms of segment scores across the challenge set (cool colours/bottom) and submitted WMT systems (warm colours/top). The off-diagonal entries are scatterplots in which each point is a single segment, positioned according to the scores assigned to it by the row and column metrics; the points use the same colour scheme as the histograms.
Hovering over the plots displays tooltips with additional information. You can change the language pair using the dropdown menu, select the subset of metrics to examine using the metric buttons, and highlight a particular system in black by hovering your cursor over the system name in the legend. Note that the plots may take a moment to update.
System average scores for the selected language pair. MSLC systems (cool colours, left) are ordered by BLEU score and brief manual examination; WMT submitted systems (warm colours, right) are ranked by your choice of metric (default: MQM).
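As a rough illustration of how a system-average view like this could be produced, the sketch below averages segment-level scores per system and metric, then ranks systems by a chosen metric. It is not the code behind the figures: the `(system, metric, score)` row layout, the function names, and the `higher_is_better` flag are all assumptions made for this example, and the correct sort direction depends on how each metric (including MQM) is reported.

```python
from collections import defaultdict
from statistics import mean


def system_averages(segment_scores):
    """Average segment-level scores for each (system, metric) pair.

    segment_scores: iterable of (system, metric, score) tuples.
    This row layout is hypothetical, chosen only for illustration.
    """
    grouped = defaultdict(list)
    for system, metric, score in segment_scores:
        grouped[(system, metric)].append(score)
    return {key: mean(values) for key, values in grouped.items()}


def rank_systems(averages, metric, higher_is_better=True):
    """Return system names ordered by their average score under one metric.

    higher_is_better is an assumption the caller must set per metric.
    """
    scored = [
        (system, avg)
        for (system, m), avg in averages.items()
        if m == metric
    ]
    scored.sort(key=lambda pair: pair[1], reverse=higher_is_better)
    return [system for system, _ in scored]
```

With averages computed once, switching the ranking metric only requires calling `rank_systems` again with a different `metric` argument.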
Each subfigure shows the scores assigned to the 10 items in each category (punctuation, words, phrases, or sentences/full segments), with the vertical red lines indicating the lowest and highest scores assigned by this metric to any of the WMT news test data for any submitted MT system in this language pair.
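The vertical red lines described above amount to a minimum and maximum taken over every segment score a metric assigned to any submitted system. A minimal sketch of that computation follows, assuming the scores for one metric and one language pair are held in a dictionary keyed by system name; the data layout and function names are hypothetical and are not taken from the paper's code.

```python
def wmt_score_range(wmt_scores):
    """Return (lowest, highest) score across all systems' segments.

    wmt_scores: dict mapping system name -> list of segment scores
    for a single metric and language pair (hypothetical layout).
    """
    all_scores = [s for scores in wmt_scores.values() for s in scores]
    if not all_scores:
        raise ValueError("no scores provided")
    return min(all_scores), max(all_scores)


def outside_range(score, bounds):
    """True if a score falls outside the (low, high) bounds."""
    low, high = bounds
    return score < low or score > high
```

A score for an empty input or empty hypothesis that falls outside these bounds lies beyond anything the metric assigned to the submitted systems' output on the test data.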