Recently, Techsumption embarked on an exciting research project that required collecting data from various well-known social media platforms. The objective of the project was to provide more insight into the actions companies take publicly.

There were a few key difficulties which the team experienced while executing the project:

  • Many of the social media platforms had complex anti-scraping mechanisms to prevent automated collection of data
  • Many of the social media platforms limited what public (signed-out) users could see, so one had to be signed in to view relevant conversations
  • Many of the social media platforms loaded data with a delay, so a mixture of API calls and traditional data extraction methods had to be employed

Regardless, the team was able to collect in the region of hundreds of thousands of rows of data, and went on to clean and compare the data within two weeks. We then gave a final presentation to our research partner and identified key areas for further investigation from the data.

One interesting insight was that while quantitative data can present a visual and accurate picture of trends, a qualitative understanding of how and why things happen completes the picture and leads to additional, actionable insights.

These are the steps we took to investigate the data:

  1. We first compared the data on a time and topical basis
  2. We then looked for data peaks and troughs across the datasets
  3. After identifying interesting spots (areas before sudden increases or decreases specific to an area of measurement), we correlated them with real-world phenomena and presented various reasons as to why certain groups of data would tell us different things about what we were exploring
  4. Right after, we used our proprietary in-house algorithms to methodically and systematically identify strong correlations with several assumptions
  5. Eventually, we were able to come up with various interesting observations that were statistically supported and that answered the initial questions from our research collaborator
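
To make the peak, trough and correlation steps a little more concrete, here is a minimal Python sketch. It is not the proprietary in-house algorithm mentioned above; it only illustrates the general idea using a simple local-extremum check and a plain Pearson correlation. The function names, the `min_change` threshold and the sample numbers are all hypothetical and chosen purely for illustration.

```python
from math import sqrt


def find_turning_points(values, min_change=0.0):
    """Return (peaks, troughs): indices where a value is higher (or lower)
    than both of its neighbours by more than min_change."""
    peaks, troughs = [], []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if cur - prev > min_change and cur - nxt > min_change:
            peaks.append(i)
        elif prev - cur > min_change and nxt - cur > min_change:
            troughs.append(i)
    return peaks, troughs


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical daily activity counts from two platforms (made-up numbers).
platform_a = [12, 15, 40, 18, 14, 9, 3, 11, 16, 35, 17]
platform_b = [10, 13, 33, 20, 12, 8, 5, 9, 15, 30, 14]

peaks, troughs = find_turning_points(platform_a, min_change=5)
print("peaks at days:", peaks)      # [2, 9]
print("troughs at days:", troughs)  # [6]
print("correlation between platforms:", round(pearson(platform_a, platform_b), 3))
```

The days flagged here would be the "interesting spots" to check against real-world events. A high correlation between two series only shows that they move together, so the qualitative step of asking why still has to follow.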

