So there I was, sitting in my living room with a bunch of friends, going through old GroupMe chats and reliving the glory days. Then a question popped into my head: "Who is the most popular person in this chat?" So I did what any nerd would have done: I grabbed my laptop, downloaded my message data, and started coding.
GroupMe allows you to download all of your personal and message data from their web app. Once you are logged in, click on your profile and then click "Export My Data".
For this walkthrough we will be using the message data section, but feel free to explore all of your data. Select the group message or direct message you want to analyze. You will then get an email with a link to download your message data.
In the download there will be a couple of files, but the one that we are most interested in is the message.json file. It contains every message that was sent, along with some metadata, in one huge JSON file. Here is an example of a single message:
{
attachments: [],
avatar_url: 'https://i.groupme.com/xxxxx',
created_at: 1521294021,
favorited_by: [ '999999' ],
group_id: '11223344123',
id: 'xxxxxxx',
name: 'Elon Musk',
sender_id: '12345',
sender_type: 'user',
source_guid: 'xxxxxxx',
system: false,
text: 'Telsa is sick',
user_id: '11223344'
}
Some of the fields that could be interesting are name, favorited_by, and text. name is the sender of the message, favorited_by is an array of user IDs that have favorited the message, and text is the actual message that the user sent.
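As a quick illustration, the three fields can be pulled out of a single message like this. The describeMessage helper is a made-up name for this sketch, and the fallback for a missing text is a defensive assumption, not something the export is known to guarantee.

```javascript
// Hypothetical helper: summarize the three fields discussed above.
function describeMessage(message) {
  const likes = message.favorited_by.length; // one user ID per like
  const text = message.text || '(no text)';  // assumption: text may be empty or null
  return `${message.name} said "${text}" and got ${likes} like(s)`;
}

// A trimmed-down copy of the example message
const sample = {
  name: 'Elon Musk',
  favorited_by: ['999999'],
  text: 'Telsa is sick'
};

console.log(describeMessage(sample));
```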
Fire up your favorite text editor and let's get cracking. We are going to write some JavaScript using Node.js to do some data analysis. Node gives us access to the file system through the fs module, which lets us load our message.json file.
const fs = require('fs');
const rawJSON = fs.readFileSync('message.json');
const messageJSON = JSON.parse(rawJSON);
Here we need to parse the rawJSON that we loaded from the file. This gives us a workable object (in this case, an array of messages) instead of a buffer. Note that we are using fs.readFileSync, which means the data is loaded synchronously rather than asynchronously: the script waits until the whole file has been read before moving on.
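For comparison, an asynchronous load might look like the sketch below. It uses fs.promises.readFile from Node's standard library. The loadMessages name and the message.json path in the usage comment are assumptions for illustration.

```javascript
const fs = require('fs');

// Hypothetical helper: read and parse the export without blocking the script.
async function loadMessages(path) {
  const raw = await fs.promises.readFile(path, 'utf8');
  return JSON.parse(raw);
}

// Usage (assumes message.json sits next to the script):
// loadMessages('message.json').then((messages) => console.log(messages.length));
```

For a one-off script like this one, the synchronous version is simpler and works just as well. The asynchronous form matters more when other work should continue while the file loads.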
Since we are interested in finding which user has the most likes and the most messages sent, we want an object to store this information. An object stores key-value pairs. Here the key will be the user's name and the value will be another object with the keys messages and favorites. Objects also make updates easy: if the user is already in the object we update their counts, and if not we simply add them!
{
"Elon Musk": {
"messages": 143,
"favorites": 145
},
"Bill Gates": {
"messages": 243,
"favorites": 234
}
}
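The add-or-update idea can be sketched as a small helper. The names recordMessage and tally are invented for this example and are not part of the final script.

```javascript
// Hypothetical helper: add a user on first sight, otherwise update the counts.
function recordMessage(tally, name, favoriteCount) {
  if (tally[name] === undefined) {
    tally[name] = { messages: 0, favorites: 0 };
  }
  tally[name].messages += 1;
  tally[name].favorites += favoriteCount;
  return tally;
}

const tally = {};
recordMessage(tally, 'Elon Musk', 2); // first message: creates the entry
recordMessage(tally, 'Elon Musk', 1); // second message: updates the entry
console.log(tally);
```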
The first thing we want to do is loop over all of the messages. As we go, we build up our users object from the array of messages, incrementing the message count and the favorite count for each user.
let users = {};
// Loop over all of the messages
for (let i = 0; i < messageJSON.length; i++) {
  let currentMessage = messageJSON[i];
  let tempName = currentMessage.name;
  if (users[tempName] === undefined) {
    // The user is not in our users object yet, so create a new user
    // with the key of name and the value of {messages: 1, favorites: 0}
    users[tempName] = { messages: 1, favorites: 0 };
  } else {
    // The user is already in the users object
    users[tempName].messages += 1;
  }
  // Add the length of the favorited_by array to the user's total favorites
  users[tempName].favorites += currentMessage.favorited_by.length;
}
console.log(users);
Using Node, we can run the script by entering the node filename.js command in a terminal from within your project directory. You will see the output written to the console like so:
{
'Elon Musk': { messages: 143, favorites: 145 },
'Bill Gates': { messages: 243, favorites: 234 },
'Mila Kunis': { messages: 74, favorites: 24 },
'Lil Yachty': { messages: 3, favorites: 33 }
}
Great! Now we can see the number of messages that each person sent and the total likes that they have received.
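To go from the raw counts to an actual ranking, one option is to sort the entries. This is only a sketch: rankUsers is a hypothetical helper, the sample data mirrors the shape of the output above, and favoritesPerMessage is an extra measure added here as one possible definition of "popular".

```javascript
// Hypothetical helper: turn the users object into a list sorted by a chosen key.
function rankUsers(users, key) {
  return Object.entries(users)
    .map(([name, stats]) => ({
      name,
      ...stats,
      // Assumption: likes per message is one reasonable popularity measure
      favoritesPerMessage: stats.favorites / stats.messages
    }))
    .sort((a, b) => b[key] - a[key]); // highest value first
}

// Sample data in the shape of the output above
const sampleUsers = {
  'Elon Musk': { messages: 143, favorites: 145 },
  'Bill Gates': { messages: 243, favorites: 234 },
  'Mila Kunis': { messages: 74, favorites: 24 },
  'Lil Yachty': { messages: 3, favorites: 33 }
};

console.log(rankUsers(sampleUsers, 'favorites')[0].name);           // Bill Gates
console.log(rankUsers(sampleUsers, 'favoritesPerMessage')[0].name); // Lil Yachty
```

The two rankings disagree, which shows that the answer depends on whether "most popular" means the most total likes or the most likes per message.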
With this code I was able to answer my question of who the most popular person in a group message is. But don't stop here; this is just the absolute basics. There are so many more things you could do. For example, try putting this data into a charting library like d3.js, or try to find the most common word or phrase that was sent. You could even build a SaaS that gives users even more analytics on their messages. Go out and build something great!
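As a starting point for the most-common-word idea, here is a rough sketch. The countWords helper is hypothetical, the regular expression only handles plain English letters and digits, and skipping messages without text is an assumption about the data.

```javascript
// Hypothetical helper: count how often each word appears across messages.
function countWords(messages) {
  const counts = new Map();
  for (const message of messages) {
    if (!message.text) continue; // assumption: some messages have no text
    const words = message.text.toLowerCase().match(/[a-z0-9']+/g) || [];
    for (const word of words) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  // Return [word, count] pairs, most frequent first
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const sampleMessages = [
  { text: 'Telsa is sick' },
  { text: 'Sick burn' },
  { text: null }
];

console.log(countWords(sampleMessages)[0]); // [ 'sick', 2 ]
```

On real chat data you would probably also want to filter out very common words such as "the" and "is" before ranking.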