Facebook and data..
This is a post on Facebook data.. my Facebook friend list data..
Given I tell FB a lot, it was quite interesting to see how FB doesn't make it easy for me to even track what is going on with my friend list, or update me on what is changing.
A few hours ago, I was trying to see how many of my friends post political stuff on my feed. I could count 8 people off the top of my head, and skimming my feed from the last week confirmed it. Given it is the holiday season, it is easy to spot the political posts.. My count was accurate. My entire feed is pretty much overwhelmed by these 8 people. Maybe the others don't post much? Maybe the most vocal ones are the ones that keep forwarding stuff to everyone on their timeline? Maybe the things they are forwarding are paid ads masquerading as memes or just regular forwards? Maybe the people are not who they claim to be on FB and the account is hijacked? A lot of thoughts led me to try, once again, to find out how my friend list, which I had reduced to 150 two years ago, is back at 318 now, and what has changed in the friend mix.
Any analysis starts with the data set. Exporting just the list of friends and the number of friends they have (or mutual friends) is not trivial. I spent a lot of time copy-pasting stuff from HTML to CSV to JMP and had to do some cleanup just to get the data into three columns (friend's name, their total friends, mutual friends). Now why FB lists only mutual friends in some cases I do not know. At first glance it looks like some folks do NOT want to even disclose how many friends they have!! I don't know what the point is of being on a social network if you are that worried. Let's move on..
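For anyone who wants to skip the manual cleanup, here is a minimal sketch of that step in Python. It rests on an assumption that is not from FB itself: that the copy-pasted text comes out as a name line followed by a line like "1,204 friends" or "152 mutual friends". The function names, column names and sample names are made up for illustration.

```python
import csv
import io
import re

# Assumed shape of a count line: "1,204 friends" or "152 mutual friends".
COUNT_RE = re.compile(r"([\d,]+)\s+(mutual\s+)?friends?", re.IGNORECASE)


def parse_friend_lines(lines):
    """Turn name / count lines into (name, total_friends, mutual_friends) rows."""
    rows = []
    name = None
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        match = COUNT_RE.fullmatch(text)
        if match is None:
            # A new name line; keep a previous name that had no count at all.
            if name is not None:
                rows.append((name, None, None))
            name = text
            continue
        if name is None:
            continue  # count line without a name in front of it, skip
        count = int(match.group(1).replace(",", ""))
        if match.group(2):
            rows.append((name, None, count))  # only mutual friends shown
        else:
            rows.append((name, count, None))  # total friend count shown
        name = None
    if name is not None:
        rows.append((name, None, None))
    return rows


def rows_to_csv(rows):
    """Write the three columns as CSV text (missing values become empty cells)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["name", "total_friends", "mutual_friends"])
    writer.writerows(rows)
    return buffer.getvalue()


sample = ["Friend A", "1,204 friends", "", "Friend B", "152 mutual friends", "Friend C"]
rows = parse_friend_lines(sample)
print(rows_to_csv(rows))
```

Keeping the missing value as an empty cell, rather than a zero, matters later: a friend who shows only a mutual count should drop out of the total-friends distribution instead of looking like someone with no friends.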
Then I found that some of my friends have deactivated their accounts. Their pictures are blank and there was nothing listed. When I click on that friend's page it says "this account is deactivated. you have the option to unfriend this person". WTH? This was for multiple people. So the person just left FB without a note of any kind to say they are deactivating it?
72 of my friends show only a mutual friend count, which makes it difficult to do a fair distribution. It was a painful realization that there was no way to get the total friend count for them. You have to go to that friend's page and see if it will show up. Even my wife's page will only show that we have 152 mutual friends. It won't tell me how many friends she has in total. Not that this count matters, but if you are looking at data and trying to see patterns, it makes sense to compare apples to apples.
Then I put in a classification by how they are my friends.
F- Family (anyone related by blood or marriage)
HS- high school (PS. Senior)
C- college (classmates, juniors, seniors) from IT-BHU
GS - Grad school (All the Drexel and RPI connections)
FF- Family friends (this one is tricky because some of these people we met in social settings or are my kids' classmates, but the majority are spouses of friends from HS, C, GS or my wife's classmates and their spouses)
BL- friends from my Blogger days who I have managed to stay in touch with till now!
Y- Yoga (there are teachers and there are students, but I just lumped it into one)
W- Friends I made at Cypress Semi at work
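The codes above are easy to tally once each friend has one. A minimal sketch, assuming each friend has been tagged with exactly one code; the sample list is made up and is not my real data.

```python
from collections import Counter

# Codes and labels as defined in the list above.
CATEGORY_LABELS = {
    "F": "Family",
    "HS": "High school",
    "C": "College",
    "GS": "Grad school",
    "FF": "Family friends",
    "BL": "Blogger days",
    "Y": "Yoga",
    "W": "Work",
}


def category_distribution(codes):
    """Count friends per category code, largest group first."""
    unknown = set(codes) - set(CATEGORY_LABELS)
    if unknown:
        raise ValueError(f"unknown category codes: {sorted(unknown)}")
    counts = Counter(codes)
    ordered = sorted(CATEGORY_LABELS, key=lambda code: -counts[code])
    return [(code, counts[code]) for code in ordered]


# Hypothetical sample, one code per friend.
sample_codes = ["F", "HS", "HS", "Y", "C", "HS", "Y"]
distribution = category_distribution(sample_codes)
for code, count in distribution:
    print(f"{code:>2} {CATEGORY_LABELS[code]:<15} {count}")
```

Rejecting unknown codes up front catches typos like "H" for "HS" before they quietly turn into an extra bar on the chart.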
Here are the distributions given this classification:
This one is the distribution of the friends my friends have (unfortunately for the 72 people out of 318 where it listed only mutual friends, that data is not included in this).
If you read the anthropology books Guns, Germs, and Steel or Sapiens, the authors make a case that tribes split up into rival tribes or multiple tribes after their populations cross 140-150. The social structure at that point is unmanageable and the connections become meaningless. At any one point everyone closely interacts with 15 people on a day-to-day basis. In the last 2 years I have observed this at work, at home, at gatherings and on my social media feeds. People simply become inactive.
Given that there are some celebrities in my FB friend list (read star Yoga teachers, music teachers, public figures), I took them out of the distribution to see what it looks like for the bottom two bars in the previous graph, i.e., people with under 1000 friends. The people who are between 600 and 1000 are mostly inactive but keep accepting friend requests, and looking at some of their pages, I am seriously wondering if there are account issues. Again, companies that pride themselves on AI should be able to see patterns and alert people when the friend count of one of their friends is multiplying in an abnormal fashion that does not make sense.
117/212 still have <300 friends.. and 78/212 are within 200.. I gave this 150 number a lot of thought for the "average person" (read "not someone who is trying to get a following like a business person or brand ambassador"), and maybe given today's ability to connect more easily, that number can be 300.. It should be possible to plot the number of active interactions between me and my friends to verify this. I am sure FB does this or has the capability to do this, but it would be nice for me to do it with data, at least as it relates to my account and friends.
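Counts like these can be recomputed from the total-friends column. A minimal sketch, assuming missing totals are stored as None and that "under" means strictly less than the threshold; the function name and sample numbers are made up.

```python
def share_under(totals, threshold, cap=1000):
    """Return (friends under threshold, friends counted) among known totals below cap."""
    # Drop friends whose total is hidden, and the "celebrity" accounts at or above cap.
    known = [t for t in totals if t is not None and t < cap]
    under = sum(1 for t in known if t < threshold)
    return under, len(known)


# Hypothetical totals; None marks a friend who shows only a mutual count.
sample_totals = [120, 250, 450, None, 5000, 180, 999]

for limit in (300, 200):
    under, counted = share_under(sample_totals, limit)
    print(f"under {limit}: {under}/{counted} = {100 * under / counted:.0f}%")
```

For the numbers in this post, 117/212 works out to about 55% and 78/212 to about 37%.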
In the graph above, the top four points in the Family (F) group are wife, MIL, SIL and co-sister (see graph below). Why it will only give the number of mutual friends for these four people is beyond me. You can see from the mutual friend counts that a lot of my high school and college friends are connected to each other, and the same is the case with the Yoga community. I really don't have an apples-to-apples comparison on this one, as not all the mutual friend counts are available.
The graph below is the mutual friend distribution for the cases where FB won't list their total friends but only mutual friends. Each bar is the number of friends whose mutual friend count falls within that window, e.g. 1 friend (wife) with 150-160 mutual friends, 1 with 50-60 (MIL), 2 friends with 40-50 mutual connections (SIL, co-sister), etc.
The bottom 32 people with 0-10 mutual friends are mostly not sharing their friend list?
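The windows in that graph are plain fixed-width bins. A minimal sketch of the binning, assuming a bin width of 10 and that a value sitting exactly on a boundary goes into the higher window; apart from the 152 for my wife, the sample values are made up to land in the windows mentioned above.

```python
from collections import Counter


def bin_counts(values, width=10):
    """Group values into [start, start + width) windows and count each window."""
    bins = Counter((value // width) * width for value in values)
    return {(start, start + width): bins[start] for start in sorted(bins)}


# 152 is from the post; the other values are hypothetical.
sample_mutual = [152, 55, 41, 47, 3, 8, 0]
histogram = bin_counts(sample_mutual)
for (low, high), count in histogram.items():
    print(f"{low:>3}-{high:<3} {'#' * count} ({count})")
```

Empty windows are simply left out here, which is fine for a quick look but would need filling in with zeros before charting on a continuous axis.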
So what did I learn from all these exercises of graphing and charting and cleaning up data sets?
1- It is not easy to get exportable data out of Facebook.
2- There is no notification if your friends deactivate accounts or become inactive for prolonged periods of time.
3- There is no warning if your friend's account is showing abnormal increases in friend counts.
4- A lot of my friends on the list have way more than 150 friends just like me.
5- Never got to the actual reason why I started doing this, which was to see what % of my friends post political stuff. That count is still 8/318, but there is no easy way to classify the posts in an Excel file, put a code next to each one and pareto it! Out of almost 200 most recent posts in my feed, 50 were political and they came from 8 people. They all seem to have posts on their pets, yoga and family too, which always seem to be anywhere between 50-70% of their posts. If there was a way to selectively mute just the political posts, I would be a happier person.
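If the posts could be exported and hand-coded, the pareto itself would be the easy part. A minimal sketch, assuming each post has been tagged by hand as an (author, is_political) pair; the authors and tags below are made up, and nothing here comes from any FB export.

```python
from collections import Counter


def pareto(posts):
    """Rank authors by political post count, with a running cumulative percentage."""
    political = Counter(author for author, is_political in posts if is_political)
    total = sum(political.values())
    rows = []
    running = 0
    for author, count in political.most_common():
        running += count
        rows.append((author, count, round(100 * running / total, 1)))
    return rows


# Hypothetical hand-coded feed: (author, is_political).
sample_posts = [
    ("A", True), ("A", True), ("A", True),
    ("B", True),
    ("A", False), ("C", False),
]

ranking = pareto(sample_posts)
for author, count, cumulative in ranking:
    print(f"{author}: {count} political posts, {cumulative}% cumulative")
```

With the rough counts above, 50 of about 200 posts is about 25% of the feed, coming from 8 of 318 friends, which is about 2.5% of the list.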
The real question is "Am I seeing all their posts?" or "is FB only showing me select posts from these people that are political?" or "are these people deliberately posting this only to me and a select group?"
No way to tell.
The majority of the posts are travel related or family photos and pets. For that, I will gladly keep reviewing accounts periodically, clean up stuff and go explain to people why there is no point in being "friends" on FB when I don't get to see what they look like after 5 years but keep getting memes and political forwards for years.
I am just having data withdrawal symptoms.. there is a lull at work and Christmas/New Year is coming.. next I will go look at my Yoga class graphs and charts.. Thankfully, that dataset was generated by me and is very accurate!