Mrs Trellis, red-shirts and savings - the limits of text analysis
I've just finished Tim Harford's excellent Messy (paperback just out) alongside Giuseppe Tomasi di Lampedusa's The Leopard. Both threw up phrases that brought a smile to my face and that, along with an example from a conference earlier this year, highlight one of my concerns about automatic forms of text analysis. To be clear, I'm not against text analysis as part of a broader piece of work, but I am concerned about the limits of automated analysis and its place in research practices.
Language clearly carries meaning, so when we're looking to understand beliefs, narratives, cultural norms, it would be strange to ignore a fundamental element of communication. However, from experience both in my own practice and in working with clients, I've seen that there is a strong temptation to focus on language/text analysis* as the primary source of meaning - and that is something I'm against.
Example 1. To return to Tim Harford, in a chapter on Life, he talks about the limitations of categorisation when filing things:
Making three copies of correspondence and filing once by date, once by topic and once by correspondent is a logical solution for a world in which we cannot predict whether we might need to look up all the letters sent and received late in October 2015, or all the letters about the faulty rumbleflange, or all the letters from a Mrs Trellis.
For some of us, those last two words are a clear signpost that Tim is a well-educated connoisseur of radio comedy programmes.
Example 2. In Lampedusa's Leopard, much of the early part of the novel takes place against the backdrop of Garibaldi's campaign to unify Italy in 1860 - specifically in Sicily. Young men in Garibaldi's army were clearly identifiable by their clothing, as shown by this passage later in the novel:
Don Fabrizio did not quite understand; he remembered both the young men in lobster red and very carelessly turned out. "Shouldn't you Garibaldini be wearing a red shirt, though?"
For any Star Trek fans or readers of this old post, there are two important words in there...
Example 3. At the UN Data Innovation Lab event earlier this year, one piece of research into Google searches highlighted that in one country (from memory, it was Colombia - apologies if I'm mis-remembering) there were two interesting insights - when the economy is dipping, people search more often for "savings" and when the economy is rising, people search more often for "shoes". Interesting as a possible indicator (presuming that the searches precede the economic move), but in terms of understanding what people need, "savings" isn't as helpful as it might be.
For the examples, there are deeper layers of meaning that require contextual understanding:
- "Mrs Trellis" is a regular correspondent to "I'm Sorry I Haven't A Clue" - a long-running BBC Radio 4 show. The mere mention of her name to regular listeners evokes laughter.
- "Red shirts" has a very different meaning depending on whether you're a Star Trek fan (see here) or reflecting on a significant moment in Italian history.
- "Savings" has multiple possible interpretations - is that savings you make when buying something at a discount, or savings you put into a separate account for unexpected problems or purchases?
Human interpretation by culturally and contextually appropriate people will help elicit those multiple layers of meaning. And algorithmic interpretation may help to group and theme material if it repeats in text-based data.
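To make the "savings" ambiguity concrete, here is a minimal Python sketch of the kind of keyword counting that sits underneath many simple text tools. The search queries are invented for illustration and are not taken from the research mentioned above; `count_term` is a hypothetical helper, not part of any particular product.

```python
import re

# Hypothetical search queries, invented for illustration only.
queries = [
    "savings on winter shoes",               # a discount
    "best savings account for emergencies",  # money put aside
    "holiday savings deals",                 # a discount
    "how to build up savings",               # money put aside
]

def count_term(texts, term):
    """Count how many texts contain the term as a whole word."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return sum(1 for text in texts if pattern.search(text))

# Every query counts as one hit for "savings", whichever sense was intended.
print(count_term(queries, "savings"))
```

The count treats all four queries as the same thing. Telling the discount sense from the rainy-day sense still needs someone, or something, that understands the context of each query.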
When we are looking to understand the meaning of large volumes of qualitative material and micro-narratives, we are better off relying first on meaning that has been signified by the contributors - the meta-data added by respondents in SenseMaker® work. Using that meta-data is a better indication of meaning initially - and then we can identify clusters of stories and text that can throw further illumination. At that secondary level of analysis, I think text analysis can significantly help - but not before.
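As a rough sketch of that ordering, the Python below groups stories by a label the respondent chose and only then counts words inside each group. The stories, the `signifier` field and the stopword list are all assumptions made up for this example; this is not SenseMaker®'s data format or method.

```python
from collections import Counter, defaultdict
import re

# Hypothetical micro-narratives, each carrying a label chosen by the respondent.
# Field names and labels are invented for illustration.
stories = [
    {"signifier": "worried", "text": "My savings ran out after the repairs"},
    {"signifier": "worried", "text": "No savings left for the repairs this year"},
    {"signifier": "hopeful", "text": "Found real savings in the spring sales"},
]

# A tiny, hand-picked stopword list; a real one would be much longer.
STOPWORDS = {"my", "the", "for", "in", "no", "this", "after", "out"}

def words_by_signifier(items):
    """Group stories by respondent-assigned signifier, then count words per group."""
    groups = defaultdict(Counter)
    for item in items:
        tokens = re.findall(r"[a-z']+", item["text"].lower())
        groups[item["signifier"]].update(t for t in tokens if t not in STOPWORDS)
    return groups

# Text analysis happens second, inside clusters the respondents defined.
for signifier, counts in words_by_signifier(stories).items():
    print(signifier, counts.most_common(3))
```

Here the word "savings" shows up in both groups, but the respondent's own label already indicates which kind of story it belongs to before any word counting starts.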
There are, however, three other significant issues when using text analysis (or even over-simplistic tools like Wordle, as I did in the early days of working with narrative a decade ago).
- Complex systems work needs information about modulators, decision-mechanisms, rituals, boundaries, identities and the like - elements that rarely come out of text analysis tools. I've yet to see a text- or language-based tool that does anything other than show themes or connections between words - with the increasingly frequent addition of thesaurus-like elements to group similar concepts into single groups or phrases. If we are looking to understand what is going on in a human system and to identify potential interventions or nudges, then we need to build a framework of questions and meta-data with that in mind - it is unlikely to come from text analysis.
- The underlying algorithms need careful consideration - particularly in government and NGO use. We're seeing the unintended and damaging effects of automated algorithms in many cases - as Cathy O'Neil, author of Weapons of Math Destruction, blogs about here. Facial recognition systems have been questioned, and the consequences of flawed algorithms could be significant. The same questions therefore need to be asked of text analysis algorithms - how do they cluster, what do they dismiss, what inherent biases do we need to be aware of?
- The final point is a personal hang-up and may be arrogance on my part. From experience, I have seen clients respond immediately to wordclouds and clusters, and I've seen others start searching immediately for particular words and assume a hypothesis is proved by their presence. The attractiveness and intuitive communication of text-based data visualisations is appealing - but I believe we have a responsibility as consultants to help clients understand the deeper issues. If - as is often the case - they will leap to conclusions from a wordcloud and then pay less attention to the more rigorous meta-data-based material, I think we need to focus on the better-informed but less attractive element.
Ten years ago, before I used SenseMaker®, I would happily generate wordclouds from material I'd gathered with clients. Once I'd realised they were misleading but appealing, I stopped - and for a long time we didn't use wordclouds at all. These days I'm prepared to use them, but only as a secondary data visualisation to cast light on what has emerged from clusters of meta-data.
But I'm operating on less-than-perfect information - does anyone have any experience or deep knowledge that might help me put some of my concerns above to rest?
*I'm using language/text analysis generically here - I haven't yet done the research into the various analysis tools available. I have no doubt that, like any tool, there will be some that are better than others. My concerns stand in the face of any automated tool that claims to make meaning from fragmented, natural language.