AI automation for GR Work: some applications
Today's guest post discusses how to use AI to summarize GR-relevant texts
Today’s issue is a guest post by Christian Dippel, co-founder and CEO of LobbyIQ, talking about AI and technology automation in government relations work. Christian also discusses how LobbyIQ is using AI to generate crisp 3-page summaries of parliamentary committee-meetings within an hour of the meetings ending.
For a real-life application of the discussion here, we post an example summary of one of today’s committee meetings, which ended less than an hour ago.
In my day job as a Professor of Business Economics and Policy, I teach some of Canada’s brightest young business students in the last semester before they graduate. Most have their jobs lined up and their short-term careers laid out when I meet them, but they still worry about the medium-term impact of AI on their careers, knowing that it is a question of when, not if, the AI accountant, AI consultant and AI financial advisor will become viable consumer options that can disrupt their chosen professions.
One thing I tell my students is that our definition of skill is going to change over the next decade. We have grown used to measuring skill in years of education. In economics research, the words “skill-premium” and “college-premium” are usually treated as if they were basically the same thing. (Not surprising perhaps, since it makes researchers feel more skilled.) In ten years, however, my guess is that skill will basically come to be defined by how easily and how well a machine (software or hardware) could do our job.
In fact, this has always been true; it’s just that for the last fifty years, the type of work people with college degrees tended to do was mostly made more productive (“complemented”) by technological improvements, whereas the type of work people without college degrees tended to do was more often replaced (“substituted”) by technology (eg assembly-line factory work and secretarial white-collar work).
In the next ten years, however, AI will automate many of the tasks that have long been done by the college-educated. In that sense, we may come to think of plumbers, carpenters and car mechanics as more skilled than many white-collar professionals.
But I also tell my students that while maximum job security will lie in occupations that maximize distance to technological automation, this also comes at a cost because these occupations mostly do not leverage the upside of technological automation either. In contrast, maximum opportunity will lie in occupations where we can leverage technological automation.
So, how does this apply to government relations work?
Well, there is a lot of grunt work in government relations that can be automated, allowing skilled GR practitioners to free up time and resources to focus on the high-level work.
Today I want to talk about one of the lowest-hanging fruits in this domain, which we have already fully automated at LobbyIQ: summarizing committee-meetings.
1. Summarizing Committee-Meetings with AI
Sitting through a 2-hour committee meeting is not an enviable task: most of what is said ends up not being mission-critical, but we can’t know beforehand if and when the critical bits will come up. And we can’t really outsource the task for exactly that reason. What if we ask an intern to monitor the meeting and their attention drops off just when a critical witness is called? Or what if they don’t have the context to recognize what’s truly important? That’s why it frequently falls on experienced consultants to monitor meetings, pay attention throughout and keep meticulous notes.
It would be a lot easier if there were a live transcript that we could Ctrl-F through, but the official transcripts take 10-14 days to get published, and the unofficial BLUES are a band-aid solution at best: they come out one or two days after the meeting, the (often sizable) French parts are not translated, and they are images, not text, so we can’t Ctrl-F through them or copy from the text.
So this is an area of GR work where AI can really take a load off the grunt aspect of the job, and I want to talk a little bit about how we do this at LobbyIQ.
The first step is to turn audio into text, a technology that’s been around for a long time. It is a form of AI, depending on how broadly we want to define the term artificial intelligence, but it does not rely on the large language models (LLMs) that have dominated the discourse in the last 14 months.
Because this older technology transcribes speech without drawing on context from the surrounding sentences, the initial output is riddled with phonetically similar but incorrect words and with grammatical errors. It is basically illegible, or at least very painful to read.
The second step is therefore to use AI to generate a proper transcription from the initial text. This is pretty easy, basically a tuned-up version of the spelling- and grammar-correction models we have been accustomed to for twenty years.
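These first two steps can be sketched as a simple composition of two models. In the sketch below, `speech_to_text` and `correct` are hypothetical stand-ins for whatever audio-to-text and text-correction models one plugs in; this is an illustration of the structure, not LobbyIQ's actual code.

```python
def transcript_pipeline(audio_path, speech_to_text, correct):
    """Steps one and two: audio -> raw text -> readable transcript.

    `speech_to_text` and `correct` are placeholder callables standing in
    for real models (an assumption for illustration only).
    """
    raw = speech_to_text(audio_path)  # step 1: context-free transcription
    return correct(raw)               # step 2: LLM-style cleanup pass
```

Keeping the two steps separate mirrors the fact that they rely on different kinds of models: a speech recognizer first, a language model second.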
At this stage, we have a reasonably good full transcript of each meeting, and we are in the ballpark of where we would be if we waited two weeks for the official transcript to come out.
Enter the third and final step, which is to summarize this text. Summarization using AI is not as straightforward as one might think. This is because LLMs are actually quite limited in how much input they can take in one go, and how much output they can return. For example, when we ask ChatGPT for a summary of the novel War and Peace, the GPT model does not actually summarize War and Peace (although the novel may be contained in the corpus of text that was used to train GPT). Instead it summarizes one or several summaries of War and Peace that were generated elsewhere by a human (eg on Wikipedia) and became part of GPT’s corpus of training text.
If we actually use an LLM like GPT to summarize the transcript of a committee-meeting, we quickly run into constraints, regardless of whether we go the low-tech route of copy-pasting into the ChatGPT user interface or the high-tech route of programmatically feeding text into the GPT API as tokens. On the ChatGPT interface, we face an input limit of about 2,048 words, but the average committee meeting runs to about seven or eight times that (roughly 16,000 words across a 35-page document).
We therefore need to “chunk” the text into seven or eight separate pieces. In principle, we can then feed them into an LLM one chunk at a time. However, aside from being tedious, this still wouldn’t work because GPT does not remember the context that is needed to connect the different pieces. As a result, we would get seven or eight separate sub-summaries. The problem with this is that we get a lot of repetition (GPT giving context that it has already given in the previous sub-section), and (less annoying but actually more problematic) it may start hallucinating things to make sense of what it is summarizing.
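The chunking step itself is mechanical. Here is a minimal sketch in Python, assuming we measure chunk size in words; the 2,000-word budget is illustrative, since the real cutoff is measured in model tokens rather than words.

```python
def chunk_words(text, max_words=2000):
    """Split a long transcript into pieces small enough for one LLM request.

    max_words is an illustrative budget under the ~2,048-word interface
    limit mentioned above; real APIs count tokens, not words.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# A ~16,000-word meeting transcript yields 8 chunks at this budget.
transcript = "word " * 16000
print(len(chunk_words(transcript)))  # -> 8
```

This alone, however, only produces disconnected sub-summaries, which is exactly the problem described above.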
At LobbyIQ, we solve this problem with a technique called chain-prompting. It basically involves feeding the summary from the first chunk back into GPT, and explaining to it that this is what happened so far, before asking it to summarize the next chunk. Then we repeat the process as many times as needed to get through the whole document.
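In code, the chain-prompting loop looks roughly like this. The prompt wording is a paraphrase and `llm` is a placeholder for a real GPT API call; neither is LobbyIQ's actual implementation.

```python
def chain_summarize(chunks, llm):
    """Chain-prompting: carry the running summary into each new request.

    `llm` stands in for a real model call (e.g. the GPT API); here it is
    any function mapping a prompt string to a summary string.
    """
    running_summary = ""
    for chunk in chunks:
        prompt = (
            "Here is a summary of the meeting so far:\n"
            f"{running_summary}\n\n"
            "In that context, summarize the next portion of the transcript, "
            "not too long but without omitting critical details:\n"
            f"{chunk}"
        )
        running_summary = llm(prompt)  # feed the summary back in
    return running_summary
```

Because each prompt carries the running summary forward, the model regains the cross-chunk context it would otherwise lose between requests.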
What I found most interesting in the process of fine-tuning our chain-prompting code was the extent to which this is an art more than a science. Firstly, explaining to GPT what has happened thus far in a meeting feels weird because it was the same GPT that just summarized this first part for us. It’s as if you’re dealing with a person with massive short-term memory loss. Even weirder, you are now asking that same person to take on the cognitively demanding task of reading that first summary, and summarizing the next chunk in the context of the summaries of the previous chunks. The second thing that feels weird is just how “conversational” the prompting process is. It feels very similar to how we might explain to a teenager how to write a good essay, and then iterate on our explanation when the output is not satisfactory.
This is why some large organizations have their own designated and full-time “prompt-engineers.”
At the end of this chain-prompting process, we generate crisp 3-4 page summaries with a word count of about 12-15% of the original transcript.
Interestingly, we cannot tell GPT directly what word-count we want in the final product. Instead we have to use the right language (“not too long but without omitting critical details”) in our prompts to nudge it towards a desired text length.
Is the output perfect? No summary, human-generated or otherwise, ever is. But we think our AI-generated ones are very good. They still occasionally include little redundant mini-summaries in the middle of the text, because GPT’s strong tendency to summarize within each chunk occasionally wins out against our prompt instructions not to. But a little redundancy is far less of a problem than overlooking an essential part of the discussion, and on that front the AI is very good: important discussion points never seem to go overlooked.
And considering that we have the prompting configured down to the point where the process is fully automated, and we can generate 3-4 page meeting summaries within an hour of a meeting ending, a little redundancy seems well worth it.
Clients can then decide whether to use this only as a tool to double-check and compare against the notes they took themselves, or to skip the meeting altogether and let the summaries guide them on whether they need to watch any parts on ParlVu.
For reference, we posted a summary of a meeting that ended 25 minutes ago here.
2. Future Applications of AI to Government Relations
At LobbyIQ, we currently use AI in only a handful of processes. Our first application was to discover the key words and key phrases that characterize the sum of all lobby-registration descriptions underlying all communications in a sector, which gives us a sense of the themes driving each industry’s communications month by month.
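As a toy illustration of the idea (not the actual LobbyIQ pipeline, which is not described here), key terms can be surfaced with simple frequency counts over a set of made-up registration descriptions:

```python
from collections import Counter

STOPWORDS = {"the", "and", "of", "to", "for", "in", "on", "a", "with"}

def key_terms(descriptions, top_k=3):
    """Rank the most frequent non-stopword terms across a set of
    lobby-registration descriptions. A toy stand-in for real key-phrase
    extraction; the descriptions below are invented examples.
    """
    counts = Counter(
        word
        for text in descriptions
        for word in text.lower().split()
        if word not in STOPWORDS
    )
    return [term for term, _ in counts.most_common(top_k)]

descriptions = [
    "advocacy for carbon pricing exemptions in the energy sector",
    "consultations on carbon capture tax credits",
    "energy transition funding and carbon markets",
]
print(key_terms(descriptions))  # 'carbon' and 'energy' rank first
```

Real systems would use phrase extraction and more robust weighting, but the month-by-sector aggregation logic is the same.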
Our second application is the minute summarization I just described.
Another interesting application is to extract “narratives”, i.e. to ask what different entities (eg parties, sectors, or institutions) are talking about at a high level. This is an interesting task because it involves identifying and compressing meta-structures that may be expressed in hundreds or thousands of different ways.
That’s AI on the backend. On the frontend, we are also working on a Chatbot that could help our users navigate our platform.
In the medium term, I think AI can become really useful in predicting policy-direction. I think, for example, that we can predict the final shape of a regulation based on the regulation’s initial draft and based on the public comments (including as an input probably the identity of the commenter). Directionally, this sounds doable to me with current AI capabilities. It may involve a lot of model training, but the training data exists.
As a side-note, AI will almost certainly put an end to anonymous commenting “by an individual” in the public comments. I just don’t think the government can otherwise prevent AI bots from being employed to game the commenting system.
Another interesting application may be to predict changes in the tone and text of proposed legislation as it goes through the stages from First and Second Reading to Committee and Third Reading.
A third really interesting application will be to predict the legislative agenda of a hypothetical future Conservative government, based on what it is saying while it forms the official opposition. Characterizing the legislative intentions of a hypothetical future government does feel possible with enough historical text data, although there will always be a very large uncertainty factor about how unknowable future developments will shape future legislative agendas, e.g. when the 2008 financial crisis sucked the life out of the Obama administration’s healthcare reform agenda.