One of the challenges with analyzing the list of winners is that it's not published in a form that lends itself to analysis. Therefore, the first thing I had to do was regularize and normalize the list. For example, you want to get all of the percentages into the same column. The same is true of Makers, Origins, and other categories where comparisons can be made.
The first step is to copy and paste the relevant text from the page. You will find the complete, original list of winners here.
The next step is to find a way to separate like kinds of data into columns. I did this in a programming text editor called Sublime (Mac), using regular expressions to do sophisticated find-and-replace operations on the delimiters in the text. Fortunately, the listings are fairly consistent and regular. I have done a lot of this kind of work on other collections, so it took me about fifteen minutes to get this part of the project done.
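As a rough illustration of this kind of delimiter clean-up, here is a minimal Python sketch. The listing layout ("Maker (Country) - Description"), the sample line, and the function name `to_tsv` are all assumptions made for illustration; the real listings and the exact expressions I used in Sublime are not reproduced here.

```python
import re

# Hypothetical listing layout, assumed for illustration only:
#   "Maker (Country) - Description"
LISTING = re.compile(
    r"^\s*(?P<maker>.+?)\s*\((?P<country>[^)]+)\)\s*-\s*(?P<description>.+?)\s*$"
)

def to_tsv(line):
    """Turn one listing line into tab-separated columns.

    Returns None when the line does not fit the assumed layout,
    so that it can be set aside and fixed by hand.
    """
    m = LISTING.match(line)
    if m is None:
        return None
    return "\t".join([m.group("maker"), m.group("country"), m.group("description")])

# Made-up sample line, not taken from the actual list of winners.
sample = "Example Maker (Denmark) - Dark bar 70%"
print(to_tsv(sample))
```

Tab-separated output pastes straight into spreadsheet columns, and lines that return `None` are the ones left over for hand editing.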
Once I'd taken care of everything I could do using regular expressions, the next step was to go in by hand, a process that took somewhere between one and two hours. This work I did in MS Excel. This hand work included:
- Moving the cocoa percentage (when included in the description) into its own column.
- Moving the origin (when included in the description) into its own column.
- Adding a notes column into which I could copy some information from each listing.
- Eliminating redundant information where it made sense. For example, now that percentage and origin were in their own columns they could be removed from the description.
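I did these steps by hand in Excel, but part of the same clean-up could be sketched in Python. The example below is only an illustration under stated assumptions: the function names, the short `ORIGINS` list, and the sample descriptions are hypothetical, and real descriptions would still need a manual pass.

```python
import re

# A number followed by a percent sign, e.g. "70%" or "72.5 %".
PERCENT = re.compile(r"(\d{1,3}(?:\.\d+)?)\s*%")

# Assumed, incomplete list of origin names used only for this sketch.
ORIGINS = ("Nicaragua", "Peru", "Madagascar", "Honduras", "Ecuador")

def _tidy(text):
    """Collapse repeated spaces and trim stray separators left behind."""
    return re.sub(r"\s{2,}", " ", text).strip(" ,-")

def split_percentage(description):
    """Return (description without the percentage, percentage or None)."""
    m = PERCENT.search(description)
    if m is None:
        return description.strip(), None
    cleaned = description[:m.start()] + " " + description[m.end():]
    return _tidy(cleaned), float(m.group(1))

def split_origin(description):
    """Return (description without the origin, origin or None)."""
    for origin in ORIGINS:
        pattern = re.compile(r"\b" + re.escape(origin) + r"\b", re.IGNORECASE)
        if pattern.search(description):
            return _tidy(pattern.sub(" ", description, count=1)), origin
    return description.strip(), None

def normalize(description):
    """Move percentage and origin out of the description into their own fields."""
    rest, percentage = split_percentage(description)
    rest, origin = split_origin(rest)
    return {"description": rest, "percentage": percentage, "origin": origin}

# Made-up sample description, not taken from the actual list.
print(normalize("Madagascar 70% dark"))
```

Entries with no percentage or no recognizable origin come back with `None` in that field, which mirrors the blank cells left in the spreadsheet.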
With all of this done, what can we learn? What follows are objective observations from the data.
I did the analysis by importing a .CSV version of the file into a database tool called Airtable because its built-in functions handle most of the analyses, which saved me from writing custom macros or formulas.
Numbers with an asterisk following them have been updated from the original post.
* Added on September 24, 2018; from the website, the number of entries in this round was “almost 500.”
- 214 awards were given in 30* categories, including Best in Competition and special jury awards, to 67 different entrants. (Entrants submitting chocolate made by another maker, where identified, were not double-counted.)
- 41 Golds were awarded (19% of winners). 90 Silvers were awarded (42% of winners). 83 Bronzes were awarded (39% of winners). 2 Gold “Best in Competition” awards were given, and 22 awards were given in special categories.
- Among winners where a bean origin is specified, the most highly awarded origin was Nicaragua with 33* winners. Peru comes next with 30 winners, followed by Madagascar (23), then Honduras and Ecuador (8 each). 68 entries did not specify an origin and three entries explicitly stated they were blends.
- Winners hailed from 29 different countries. The country with the most winners was the UK (33), then France (31), followed by Denmark (24), Austria and Italy (15 each), Belgium (13), and Iceland (11). These 7 countries account for a total of 127 awards, or 59% of all awards.
- The most highly awarded maker was Friis-Holm (Denmark) with 22 awards, followed by Duffy's (UK) with 12, then Omnom (Iceland) with 11 awards. Together, they received about 21% of all awards given (45 of 214). 100% of Friis-Holm awards identified Nicaragua as the country of bean origin, representing two-thirds (22 of 33) of all of the winners using Nicaraguan beans.
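For readers who would rather check figures like these in code than in Airtable, here is a minimal Python sketch of the same kind of tally. It is not how I produced the numbers; the function name `shares` is an assumption, and the medal column is rebuilt from the counts reported above rather than read from the actual database.

```python
from collections import Counter

def shares(labels):
    """Count each label and return {label: (count, percent of total)}.

    Percentages are rounded to whole numbers, and labels are ordered
    from most to least frequent.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total)) for label, n in counts.most_common()}

# Stand-in for the medal column, rebuilt from the reported counts:
# 41 Golds, 90 Silvers, 83 Bronzes (214 awards in total).
medals = ["Gold"] * 41 + ["Silver"] * 90 + ["Bronze"] * 83

for medal, (n, pct) in shares(medals).items():
    print(f"{medal}: {n} ({pct}%)")
```

The same function works on any other column, such as maker, country, or origin, once the list has been normalized into columns.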
I have downloaded the Grand Jury finalist list for the European awards, and a future project is to figure out, of the entries that made it to final judging, the percentage given awards. Of course, we do not know (and cannot know unless the competition organizers choose to share that information) the number of entries that were not judged for reasons that include being damaged in shipment. Therefore, it's not possible to calculate the ratio of entries to winners.
You can find a read-only copy of the database here. If you have any updates to suggest – especially if you see an error or know a percentage cocoa content or an origin for an entry on the list – please let me know in a comment and I will update the database, and the numbers here, accordingly.