How would you manage this type of information?
February 11, 2025 6:56 PM
At my work, we deal with repeated situations where we have to manage a large number of lengthy responses to questions, as well as comments on those responses. We need ways to manage the details, and right now, we are mostly using annoyingly elaborate spreadsheets with a lot of version control problems. What tools or methodologies or ideas do you have on how to manage the info?
This question is purposefully extremely broad because I don't know if I am looking for a software package or information architecture methodology or some other thing. Please assume we cannot change anything at all about how we ask for feedback or what feedback we receive. The question is *only* about how we will manage the information we get back based on what that information looks like now. I know there will be someone who wants us to restructure how we ask the questions to get more manageable responses, but that is a battle I cannot fight and would not win.
This is all in a public policy context. To give some context to the kind of thing I am talking about, we might send out a document about an issue (let's say for example "what should the government do to regulate oil fields to preserve the environment") and ask for feedback on the ideas in that document, including a bunch of pointed questions about regulatory authority and how to ensure we don't kill off an important economic industry while also not dying from climate change.
What we get back in response will range from one line emails ("burn baby burn", "oil is murder") to 142 page PDFs that respond to each of the questions in detail with citations and footnotes and engineering studies in the appendixes. We will also receive every possible permutation of responding generally or specifically to some or all of the questions. Some of the responses will be 142 page PDFs about how we should regulate Coal Plants instead of Oil Fields but have a reference to Oil Fields on page 38. Some of the responses will respond to the one question that the stakeholder cares about and ignore the other 37. Some of the responses to questions will literally be 'No' even though the questions are almost never yes or no questions.
After that, there is another round of responses where people comment on the things other people said in the first batch of comments. Those responses may be organized by the original questions or by the person they are responding to or by broad themes or by some secret method intelligible only to the person who wrote the response. These are usually pretty detailed, but it isn't necessarily 1000% clear what, specifically, they are responding to.
We have to use all this information to develop recommendations based on the feedback we receive. Often there are a number of different team members (between 2 and 20) each working on recommendations related to subsets of the original questions. It is not possible for every team member to read every item of feedback -- there is simply too much for that -- but every word of every bit of feedback must be read by at least one team member and then the information has to be shared with the team in a way that allows team members to find all the feedback that is relevant to the questions and issues they are working with.
We also have the people who will actually engage with the stakeholders and make the decisions and we have to be prepared to get them up to speed on the relevant details in the feedback received. They often have to be prepped per stakeholder because they are meeting with one stakeholder at a time -- so they might get a document outlining everything Stakeholder A said about each issue, anything the other stakeholders said about what Stakeholder A said, as well as suggested topics for discussion when they meet with Stakeholder A. They eventually have to be briefed on an issue-by-issue basis, as well, in order to present the recommendations and have them make the final decisions.
Right now, each team that goes through one of these feedback processes typically creates their own tool for managing the responses. Often this is just an Excel spreadsheet with massive version control problems or the equivalent of an Excel spreadsheet recreated in SharePoint or OneNote with slightly different version control problems. The tables will contain rows that break down each response received by who said it, what question (or group of questions, or theme, or flight of fantasy all their own) they were responding to, a brief summary of what they said, and a copy/paste of the most relevant parts of their submission.
That is a metric fuck-tonne of manual data entry to create, but it works okay with the original submissions for the most part (other than the version control problems) because people can go in and find everything anyone said related to question 2 or to the category of questions that contains question 2. It is a bit harder for the replies, because they have to be classified by who said something, who they said it about, and what question they were talking about, and that's not always super clear.
There are records retention reasons to store this information long term, but for the most part, this is information that will be organized once, referenced heavily for 6 months, and then never looked at again. It could be exported to some kind of archive that we can shove in the file system equivalent of a cardboard box under the stairs.
I feel like there has to be a better way of managing this type of information -- something used for organizing responses to legal discovery? something used in academia to code themes in literature or in social sciences to code qualitative responses? an expert to come in and do some information architecture work to develop a system for us? options for automating some or all of the data entry or summarization of responses? better options for finding relevant info within the submissions so we don't have to copy 80% of it into a database to essentially add paragraph by paragraph tags.
I think you could almost make a software version control system like Git (e.g. GitHub) work for this.
The idea is that the documents would exist as Markdown text documents on GitHub. That only gives you pretty basic text formatting - but probably equal to or better than what you can do in Excel!
You can then edit the documents there in GitHub (or externally, with regular synchronization back to GitHub, as software developers do), and that provides your change and version tracking.
In fact you could even have different people editing & proposing different changes to the documents, and then someone would referee & decide which changes to accept or not. This, again, is exactly what software developers do via tools like GitHub.
Then, your constituent feedback exists as "issues", as in GitHub. Each response could be a separate issue, or you could break up longer feedback into discrete actionable items and each of those could be an issue.
The nice thing about this is that it gives you a discussion/back-and-forth area for each discrete issue: you can continue the discussion, add feedback or thoughts, mark the issue as open or closed, and link it back to any changes in the main documentation that resulted from it.
(These are all things that are routinely done in tracking progress & changes in software projects - the similarity with what you are doing is what makes me think of this.)
- Everything, including the main documents and the issues, is easily searchable.
- You could even open this up as a means for constituents to raise issues, discuss them, and so on. Like, they could go to the GitHub page for this project, read it, open an issue to discuss things, and so on (exactly as most software projects use GitHub). But if you don't want to or can't do that, it can still be used internally for organizing & discussing various points. If you do it this way, nothing has to be public facing at all.
An example project that is doing this very sort of thing using GitHub is here.
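To make that concrete, here's a rough, untested sketch of what filing each response as an issue could look like via the GitHub REST API - the repo name, token variable, and label scheme are just placeholders:

```python
# Sketch: file each stakeholder response as a GitHub issue, labelled by question.
# Assumes a private repo and a personal access token with issue-creation rights.
# Repo name, token env var, and label names are hypothetical placeholders.
import os

import requests

GITHUB_API = "https://api.github.com"
REPO = "my-org/oilfield-consultation"   # hypothetical repo
TOKEN = os.environ["GITHUB_TOKEN"]      # hypothetical env var


def file_response_as_issue(stakeholder: str, questions: list[int], excerpt: str) -> int:
    """Create one issue per response, tagged with labels like 'question-2'."""
    resp = requests.post(
        f"{GITHUB_API}/repos/{REPO}/issues",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Accept": "application/vnd.github+json",
        },
        json={
            "title": f"Response from {stakeholder}",
            "body": excerpt,  # or a link to the full PDF stored elsewhere
            "labels": [f"question-{q}" for q in questions] + [f"stakeholder:{stakeholder}"],
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["number"]


# Example: Stakeholder A's submission touches questions 2 and 14.
issue_number = file_response_as_issue("Stakeholder A", [2, 14], "Relevant excerpt goes here...")
print(f"Filed as issue #{issue_number}")
```

The point isn't the specific script; it's that once every response is an issue with labels, "show me everything about question 2" is just a label filter.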
posted by flug at 10:16 PM on February 11 [1 favorite]
Guessing you’re at a federal agency that needs to analyze and respond to comments on proposed regulations? Here’s a nice report from 2021 that describes approaches taken by various agencies, and a commercial tool built for this purpose (I haven’t tried it). I worked on a project several years ago to offer the public an optional way to write really well-structured comments, and it worked relatively well for the EPA when they piloted it for one rule, but nobody else picked it up, so I feel you on the constraints here.
posted by dreamyshade at 11:51 PM on February 11 [4 favorites]
I haven’t used it in a long time, but I bet that Dedoose would help with this. Multi-user, supports a wide range of text file types (Word, text, pdf, htm, html), images, video/audio, spreadsheet data (for existing records, or csvs of survey data). It’s for the research and analysis phase, rather than the collecting public comment phase, but if you want to tag/code text and parse by themes, tags, speakers/commenters, etc, it probably would have you covered.
Also, if you want to brainstorm, send me a MeMail! I’d love to volunteer some time helping you noodle on this - I work on the data side of international development, often on streamlining collection/management/analysis processes to make it easier to get to actual discussions and decision-making. My sector is currently imploding because of the new US administration, and I’d love to be able to support someone trying to clean up their data-analysis processes for the public good.
posted by rrrrrrrrrt at 11:57 PM on February 11 [1 favorite]
(dreamyshade, thanks so much for sharing! Excited to take a look at those links.)
posted by rrrrrrrrrt at 11:58 PM on February 11
A couple alternate ideas using off-the-shelf tools:
I’d be curious if you can get access to a somewhat fancier spreadsheet tool, like Airtable, and see what you can do with it. Might be able to set up templates people can use. Benefit is that it’s not too different from SharePoint-type solutions people are accustomed to.
At my work we organize and analyze qualitative user research using Dovetail, and that might be interesting to play with as well. The core functions: input lots of raw data about what people said, tag it in a variety of ways with custom tag vocabularies (coding themes, basically), and write reports with embedded quotes. Reduces duplication, very searchable. On the paid plan level we’re on, it persistently offers to create automated summaries of the raw data, which is helpful sometimes, and not too hard to ignore when you don’t want it.
Notion is another tool in this space, but it has some quirky UI stuff - too many features, trying to do too much, with obtrusive “AI”. It can be challenging to deal with for policy people who are accustomed to Excel trackers - people absolutely not afraid of conceptual complexity, but who want a tool to work in a way that feels logical and predictable.
Somebody might try to convince you that Salesforce is the solution to your problems. I suspect that would result in having several problems.
I wouldn’t hesitate to reach out to the sales teams at these types of companies, describe your challenges, and let them customize a pitch for you about their solution.
posted by dreamyshade at 12:18 AM on February 12 [1 favorite]
Have you thought about using AI to summarise and categorise the responses? It could drastically cut down the workload.
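For example (a rough sketch only, and only if an external API is even allowed in your environment; the model name and prompt are placeholders), something like:

```python
# Sketch: ask an LLM to map a free-text response onto the consultation questions.
# Only viable if an external API is permitted; model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def categorise(response_text: str, questions: list[str]) -> str:
    """Return which question numbers a response addresses, plus a short summary."""
    numbered = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(questions))
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": "You classify public consultation responses. "
                           "List the question numbers the response addresses, "
                           "then give a two-sentence summary.",
            },
            {
                "role": "user",
                "content": f"Questions:\n{numbered}\n\nResponse:\n{response_text}",
            },
        ],
    )
    return completion.choices[0].message.content


print(categorise("Oil is murder.", ["Who should hold regulatory authority?", "How should royalties change?"]))
```

A human would still need to check the results, but it could handle the first pass of "which question(s) is this 142-page PDF actually about?"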
posted by Jubey at 1:25 AM on February 12 [1 favorite]
A database will work better than a spreadsheet. The database would hold information about each response, like the date and time it was posted, who posted/sent it, and which question or comment it was replying to. Version control is much easier in a database because inserts can be handled by controlled jobs or tasks while everyone else has read-only access, and if something does need to be updated, that update is itself a transaction recorded in the database.
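A minimal sketch of what that could look like, using SQLite since it needs no server (table and column names are just illustrative):

```python
# Sketch: a small relational schema for responses and replies (names are illustrative).
import sqlite3

con = sqlite3.connect("feedback.db")
con.executescript("""
CREATE TABLE IF NOT EXISTS responses (
    id           INTEGER PRIMARY KEY,
    stakeholder  TEXT NOT NULL,
    received_at  TEXT NOT NULL,                       -- ISO timestamp
    question_no  INTEGER,                             -- NULL for general comments
    replying_to  INTEGER REFERENCES responses(id),    -- NULL for first-round submissions
    summary      TEXT,
    excerpt      TEXT
);
""")

# Insert-only workflow: rows are added once, then the rest of the team just reads/queries.
con.execute(
    "INSERT INTO responses (stakeholder, received_at, question_no, summary, excerpt) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Stakeholder A", "2025-02-11T18:56:00", 2, "Supports stricter permitting", "..."),
)
con.commit()

# Everything Stakeholder A said, plus everything said in reply to them:
rows = con.execute(
    """
    SELECT r.stakeholder, r.question_no, r.summary
    FROM responses r
    LEFT JOIN responses parent ON r.replying_to = parent.id
    WHERE r.stakeholder = ? OR parent.stakeholder = ?
    """,
    ("Stakeholder A", "Stakeholder A"),
).fetchall()
print(rows)
```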
posted by soelo at 5:08 AM on February 12
There is definitely software for this. We use Relativity.
posted by haptic_avenger at 5:44 AM on February 12
Response by poster: Thanks everyone who answered so far! I will take a look at these software options and see what might work for us. We probably can't use anything cloud-based in the states for reasons, but at least I have some new ideas about *what* I could be looking at in terms of feature sets, etc.
posted by 20 pairs of identical black socks at 7:56 PM on February 12
"Sentiment analysis" describes a type of data processing that seems relevant to your interests. I don't have a personal recommendation for a service, but the keyword might help you for further digging.
posted by girlstyle at 9:44 PM on February 12
It sounds like you have several distinct requirements tangled together here:
1) Version control (i.e., how to keep track of changes). Excel is not great for this; it just tends to be one of those things people use because they already have it. However, software VCSs (while great for their purpose) are not conducive to your use case either (for one, they typically only operate on plain text, and you have a lot of not-that).
2) Threading. Who said what to whom, and then what was the response to that, etc.
3) Document attachment/tracking. You might consider the response from stakeholder A to question 12 to be both a short bit of regular text (the proper response) as well as some number of attached PDFs, diagrams, etc. The version control services might only handle the response proper while the attachments are kept outside of the VCS.
4) Data entry. Even if you have a system for inspecting your data, how to create/maintain a process for getting that data into the system in the first place. Forms?
5) Search/discoverability. This includes tagging (of text passages?) but also, like, Ctrl+F style "find me all mentions of $THING", which is marvelously powerful in many ways (see the sketch at the end of this answer).
To me this sounds like such an incredibly particular confluence of requirements that there probably isn't some off-the-shelf widget you can just buy that does all of these things the way you want them. The closest you might come to hitting a lot of these birds with one stone is some kind of DMS (Document Management System). Have you tried Google Workspace or Box?
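For point 5, here's a minimal sketch of the kind of "find me all mentions of $THING" search you can get once the text has been extracted from the submissions. It assumes your Python's SQLite build includes FTS5 (most do); the table and column names are just illustrative:

```python
# Sketch: full-text search over extracted submission text using SQLite FTS5.
# Assumes the text has already been pulled out of the PDFs/emails by some other step.
import sqlite3

con = sqlite3.connect("submissions.db")
con.execute("""
CREATE VIRTUAL TABLE IF NOT EXISTS passages
USING fts5(stakeholder, question_tag, body)
""")

con.executemany(
    "INSERT INTO passages (stakeholder, question_tag, body) VALUES (?, ?, ?)",
    [
        ("Stakeholder A", "question-2", "Permitting authority should rest with the provincial regulator..."),
        ("Stakeholder B", "general", "This consultation should be about coal plants, not oil fields..."),
    ],
)
con.commit()

# Find every passage, from anyone, that mentions "oil fields" - even the one on page 38.
query = (
    "SELECT stakeholder, question_tag, snippet(passages, 2, '[', ']', '...', 12) "
    "FROM passages WHERE passages MATCH ?"
)
for row in con.execute(query, ('"oil fields"',)):
    print(row)
```

That still leaves someone (or something) to extract the text and apply the question tags, but it avoids copying 80% of each submission into a spreadsheet just to make it findable.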
posted by axiom at 9:19 PM on February 11 [1 favorite]