Abbott Katz details spreadsheet techniques for prospecting the stories beneath the rows and columns.

Spreadsheet journalism lesson three: losing the pound signs to find stories

Let’s get back to the Ivy-leagued Princeton University Press and its spreadsheet compendium of sociology titles. Somewhere in there a rusty spanner is rattling the works: data that look like numbers but that are in fact formatted as text, and hence currently unusable – unusable, at least, if you want to add, average, or even standard-deviate them.

That spanner was jammed into the data by the familiar pound sign, e.g., £22.95 – not a currency-formatted pound sign here, but rather an actual, typed character that drains 22.95 of its numeric character. The remedy, then, is to lose the pound signs, and in the previous post I let on that the solution at hand was one we’d already learned – the Text to Columns feature (there’s also a formula-based workaround upon which we could call, but it would unleash a couple of complications we don’t need, and we’ll pass on it here).

So here’s what we can do. Select the entire G column (just click on the letter G topping the column) and click Data tab > Text to Columns in the Data Tools button group.  You should see:

 

(Cost in pounds is the title I improvised for the column; you’ll recall the workbook had no header row).

Now this scenario does, and does not, resemble Step 1 in the Text to Columns exercise we walked through in my first post (you may want to quickly revisit that discussion). There, Excel characterised the data as Fixed width, because the information with which we working indeed evinced “spaces between each field”, as per the legend above:

 

But in the case before us our text-formatted currencies are of a piece – but one clump of text fronted by that pound sign, and so Excel asks us to identify a delimiter – that is, a constant character that signals the breakpoints at which the text in each cell will be cleaved into columns - and if that sounds a bit hifalutin’ you’ll see what I mean in a second.

Click Next and tick the Other box and type £ in the accompanying field:

 

What we’ve issued here is the following instruction: in each cell in which £ appears (that is, every cell in the G column save the one sporting the header title), let that sign serve as the delimiter, sifting out the numeric remainders of the cells – what are now their truly numeric contents. Note that because the £ is playing the delimiter role it will no longer appear in the worksheet as data.

Click Next and you should see: 

 

Click the Do not import column (skip) radio button, because the black-highlighted column above no longer has any cell contents save the title cell, and then click Finish. You should see:

 

There’s your data, now redefined in numeric terms – which you can now properly format with the pound currency symbol. The only casualty of the process – the column title, which you can re-enter now. And now those data can be added, averaged, and even standard-deviated.

Sign up for our free weekly digital magazine, Press Gazette Journalism Weekly, and daily newsletter
To contact Press Gazette with a story call 0207 936 6433
or email pged@pressgazette.co.uk
To advertise, please call 0207 936 6764.