The Power of Interest Graph Analysis: Advanced Audience Insights

When tasked with analyzing the likes, habits, and trends of a group, our go-to methodology is usually to consider each individual’s personal inclinations and extrapolate those to say something broadly about the group as a whole. We see this in opinion polls, in user surveys, and sometimes, in social analysis.

Reasoning about a group in this way isn’t inherently problematic; it’s worked very well for a long time. In the advertising and marketing world, persona-driven campaigns have proven to be an enduring and impactful strategy. Besides, we’re naturally good at this. Humans love to label people and groups: codified language like Millennials and Baby Boomers packs a lot of information into a conveniently generalized term.

Aggregated information about individuals’ traits (e.g. “How much money do you make?”, “How old are you?”) is often turned into peripheral observations when developing personas: “Tends toward upscale shopping habits” or “Decisions motivated by the need for adventure.” While comfortable, these broad generalizations are often arrived at by gut feeling about which users’ responses make up a meaningful trend among the whole group.

This is all to say: trends are supposed to be harmonious, complementary, and provable. A greater-than-the-sum-of-their-parts approach to defining personas is ultimately not very scientific, even less so when the parts themselves are basic demographics like age, gender, or income. New methods for building personas are driving label-and-demographic approaches to obsolescence.

So, if broad categorization is becoming obsolete, what’s the alternative?

At Affinio, the thing we care about is the network. Social media analysis and machine learning let us look at the connections between individuals, evaluate them, and reason about them in aggregate. Network science phrases things in terms of betweenness centrality and node influence; we use these plus other observations to say things like “this set of users knows one another very well”, or “they like the same sorts of things.” Interesting and complex dichotomies arise when we look at these kinds of attributes: what was once a 2-dimensional list of people is suddenly a many-dimensional graph.

It turns out that segmenting and clustering these networks proves to be totally enlightening. Algorithmically-defined clusters tend to be much richer than comparisons using plain demographic labels. Maybe this should come as no surprise: machine learning lets us consider lots of (many millions, actually) interest variables when comparing individuals from a group. Instead of a persona driven by things like “People who live in the southwest and like football”, we can cluster on metrics like “People who follow @AZCardinals and @FootballASU but not @Dbacks or @ASU_Baseball”, and millions of permutations thereof, shockingly quickly.
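
To make the follow-pattern example concrete, here is a minimal Python sketch of a single hand-written rule of that kind. It is not the clustering algorithm itself, which the article does not describe; the `matches` helper, the `follows_by_user` data, and the user names are all hypothetical.

```python
def matches(follows, must_follow, must_not_follow):
    """True if `follows` contains every required handle and none of the excluded ones."""
    return must_follow <= follows and follows.isdisjoint(must_not_follow)


# Hypothetical follow data: user -> set of handles that user follows.
follows_by_user = {
    "user_a": {"@AZCardinals", "@FootballASU"},
    "user_b": {"@AZCardinals", "@FootballASU", "@Dbacks"},
    "user_c": {"@AZCardinals", "@FootballASU", "@espn"},
    "user_d": {"@ASU_Baseball"},
}

# "Follows @AZCardinals and @FootballASU but not @Dbacks or @ASU_Baseball"
segment = sorted(
    user
    for user, follows in follows_by_user.items()
    if matches(
        follows,
        must_follow={"@AZCardinals", "@FootballASU"},
        must_not_follow={"@Dbacks", "@ASU_Baseball"},
    )
)
print(segment)
```

A real system would not enumerate such rules by hand; the point is only that each rule is a cheap set operation, which is part of why evaluating very many of them is feasible.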

As for the effect that this has on making persona-based decisions in marketing, advertising, branding and product development: you now have the ability to dive deep. Generalizations and user-provided labels tell a very shallow story. Likes, trends, and interest patterns get at the silent heart of social behaviour: trends are harmonious, complementary, and quantifiable.

How to make sense of the deep end of social data

Once meaningful clusters emerge from our network, a principal task of analysis becomes identifying the differences and similarities between them. One of our main jobs at Affinio is to make this data make sense: our whole platform is built to help bring over-indexing traits and trends to light among social groups, and just as importantly, to show how these differ from one group to the next.
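
One common way to express “over-indexing” is as a ratio: the share of a cluster with a trait divided by the share of the whole audience with that trait. The Python sketch below assumes that definition, which the article does not spell out; the `over_index` function and the sample data are hypothetical.

```python
def over_index(cluster_follows, audience_follows, handle):
    """Ratio of the cluster's follow rate for `handle` to the whole audience's rate.

    Values above 1.0 mean the cluster over-indexes on the handle;
    values below 1.0 mean it under-indexes.
    """
    cluster_share = sum(handle in f for f in cluster_follows) / len(cluster_follows)
    audience_share = sum(handle in f for f in audience_follows) / len(audience_follows)
    if audience_share == 0:
        return 0.0  # nobody follows the handle, so there is nothing to compare
    return cluster_share / audience_share


# Hypothetical audience of four users; the first two form one cluster.
audience = [
    {"@AZCardinals", "@espn"},
    {"@AZCardinals"},
    {"@Dbacks", "@espn"},
    {"@ASU_Baseball"},
]
cluster = audience[:2]

cardinals_index = over_index(cluster, audience, "@AZCardinals")
espn_index = over_index(cluster, audience, "@espn")
print(cardinals_index, espn_index)
```

Here the cluster follows @AZCardinals at twice the audience-wide rate, while @espn is followed at the same rate inside and outside the cluster and so says little about it.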

While the network relationships and interest patterns are the primary metrics we use to build out our different personas, we have many more tools at our disposal to help draw insight from our new segments. We expose follow patterns, self-described locations, biographical keywords, links shared, content favourited, and many more attributes to help bring clarity to clusters.

1. The ways in which people are different

“For anything to be made whole, the first step is to know what’s missing.” ― Christian Rudder

The things we label and generalize by are just as important for what they say about a cluster that has a given trait as for what they say about the clusters that lack it. “Tends to follow @AZCardinals” gives you the ability to make some decisions; “Tends not to follow @AZCardinals”, for otherwise-related clusters, gives you the ability to make many more. This is another fundamental difference between social analysis with machine learning and traditional persona building with polling: the absence of a trend makes itself apparent immediately.

We see these trends emerge in aggregate metrics as well. Differences between clusters in the time of day their members are active online, or in the amount of engagement they produce, are easily identified distinctions, but they are often red herrings when it comes to evaluating the nature of a cluster. That is, some metrics are symptomatic of a more significant identifier. We make explicit use of other metrics that are more likely to help define a cluster, such as the density (likelihood of friendship) between users in a cluster and the amount of common influence they share.
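
Density in this sense can be sketched as the fraction of possible pairs of cluster members who are actually connected. The Python example below uses the standard graph-density definition for undirected friendships, which may differ from the exact metric used in practice; the `density` function and the sample friendships are hypothetical.

```python
from itertools import combinations


def density(members, friendships):
    """Fraction of member pairs that are connected (0.0 for fewer than two members)."""
    pairs = list(combinations(sorted(members), 2))
    if not pairs:
        return 0.0
    connected = sum(frozenset(pair) in friendships for pair in pairs)
    return connected / len(pairs)


# Hypothetical undirected friendships, stored as unordered pairs.
friendships = {
    frozenset({"a", "b"}),
    frozenset({"b", "c"}),
    frozenset({"a", "d"}),
}

tight = density({"a", "b", "c"}, friendships)       # 2 of 3 possible pairs
loose = density({"a", "b", "c", "d"}, friendships)  # 3 of 6 possible pairs
print(tight, loose)
```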

2. The ways in which people are the same

The question of similarity between people, and between groups, is often a question of scope. To a Martian, someone in a Yankees cap and another in a Red Sox cap are just people with an affinity for hats—basically the same. To a Cubs fan, they’re both distinct (“Oh, I bet they can’t stand one another”), and the same (“American League, East-Coast baseball teams”). In those cap-wearers’ own minds, they may very well be mortal enemies.

The things that make us similar on social networks are worth evaluating critically:

  • The content we share.
  • The way we self-describe (our Twitter bios, for example).
  • The things we like/favourite/pin.
  • The people we follow.

The Yankees fan and the Red Sox fan above might both follow @mlb. They might even both follow @espn. A huge part of what we do at Affinio is find out when a following-pattern is really a pattern, and distinguish that from when something’s more of a global trend. If something’s too homogeneous, we tend to flag it as noise, and downgrade its importance (there would not be very much insight to be garnered if I told you an abundance of American Twitter users follow @BarackObama or @KatyPerry). Helping you see the signal through the noise is a big part of our job.
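
One simple way to downgrade accounts that nearly everyone follows is an inverse-document-frequency style weight borrowed from text retrieval. The Python sketch below is an illustration of that general idea, not a description of Affinio's actual scoring; the `follow_weights` function and the sample data are hypothetical.

```python
import math
from collections import Counter


def follow_weights(follows_by_user):
    """Weight each handle by log(total users / users following it).

    A handle followed by everyone gets weight 0.0; rarer handles get larger weights.
    """
    n_users = len(follows_by_user)
    counts = Counter(
        handle for follows in follows_by_user.values() for handle in follows
    )
    return {handle: math.log(n_users / count) for handle, count in counts.items()}


# Hypothetical follow data: everyone follows @BarackObama, fewer follow the rest.
follows_by_user = {
    "user_a": {"@BarackObama", "@mlb", "@Yankees"},
    "user_b": {"@BarackObama", "@mlb"},
    "user_c": {"@BarackObama"},
    "user_d": {"@BarackObama"},
}

weights = follow_weights(follows_by_user)
for handle in sorted(weights, key=weights.get, reverse=True):
    print(handle, round(weights[handle], 3))
```

With this weighting, the universally followed account contributes nothing to a comparison between users, while the rarest follow contributes the most.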

“Well-designed networks reduce friction and help good stuff be found. Connections allow the whole to become greater than the sum of the parts and allow new paths to discover and build meaning.” – Ev Williams

One of the reasons we consider following patterns to be paramount to social analysis is that interacting with (i.e. following) other people is a fundamental property of social behaviour. Identify your friends and interests, and establish a relationship with them: this is the minimum act that must take place on networks like Facebook, Twitter, and Instagram. This is not done out loud, where social biases might keep individuals from being honest about what they talk about in person (while I do follow some political candidates, good luck finding a tweet where I talk about my political leanings); rather, it is done silently and by default while building your personal network.

These silent patterns play a role in analysis at many levels, but to demonstrate their importance, consider the following: 61% of online Millennials get their political news on Facebook, compared to 37% from TV. This is big: it means that young people get a say in what influences them, since they follow/friend people of their own choice.

In short, when building their network, people also build an interest graph. That interest graph can be analyzed and monitored for quantifiable trends, the depth of which provides a much more holistic picture of any social behaviour than polling or demographic insights alone can.