
Author: Joni

Joni holds a PhD in marketing. He is currently working as a postdoctoral researcher at Qatar Computing Research Institute and Turku School of Economics. Contact: joolsa (at) utu.fi

How to write emails that get read? 11 tips I use daily.

Here are some tips to make people more likely to read your email.

I’ve noticed that some people struggle to communicate effectively via email, so maybe sharing these tips will help someone.

Tips:

1. include *one message* per email — when you include 2 or more, the others easily get ignored. It’s better to send a new message, like “ps. one more thing…”

2. don’t make people guess why you move from A to B; make it evident from the text. That is, make a logical argument that explains itself. Find supporting evidence when needed and be honest with yourself.

3. use short sentences, short paragraphs — people are scanning so shortness sells.

4. use plain words, don’t make people think

5. use words and phrases that cannot be misunderstood

6. be personal, use people’s names to catch their attention

7. use bolding and lists to facilitate scanning — in text-only, use *asterisk symbols* to emphasize

8. include the next steps: too many emails end up in limbo, leaving the reader wondering what to do after reading.

Moreover,

9. do the thinking for the reader, so it’s easy to take action. Sometimes this means writing a single email can take an hour or more.

10. include all the relevant people when forwarding or replying — maximum transparency, maximum information

11. however, when you want a specific response, send your message individually. For example, don’t send survey links as mass-posting; approach people personally.

Got more tips? Share!

Medieval Game of Death (#bisnesidea)

a card game in which you try to survive your way through a medieval village.

A

plague – a rat bites you and infests you with plague
death
guillotine
alchemist
mermaid
hellhound
inquisition
witchcraft
burning at stake
lonely knight
harlequin

B

map:
back alley
market place
castle
tavern
underworld
gates of heaven

A+B –> each combination has its own story

“travel through a medieval town and see if you can survive”

google: card game mechanics

1. roll a die
2. take a card
3. ???

cf. Tallinn Legends
–> get someone with a similar style to do the video voice-over

“No Startup Is an Island” – How to Use Network Pictures to Position Yourself in the Market

I got introduced to network pictures by Valtteri Kaartemo a few years back, and thought it was a cool idea. Since then, I’ve realized — after talking to many startups — that it’s more than a cool idea. It’s actually useful.

That’s because startups routinely overlook their networks and just focus on competitors. They position themselves relative to competitors, not collaborators. This can be very detrimental, because often the most connected startups do the best: they get the biggest investment rounds, the biggest sales deals, etc. They are simply liked more.

So you need to network. And using network pictures can help.

What is a network picture?

The idea of a network picture is that you draw your business as a network diagram (i.e., you in the middle as the central node, and other players linked to you as first- or second-degree nodes).

An example of a network picture (source: [1]):

The others in the picture can be any parties with a logical relation to your startup. They can be:

  • collaborators
  • customers
  • investors
  • suppliers/vendors
  • resellers/distributors
  • marketing/business development agencies
  • freelancers
  • friends and family
  • research institutes/universities
  • state departments
  • entrepreneurship societies
  • corporations with venture programs
  • press/media
  • associations/non-profits.

They can be companies or individual people (e.g., influencers, decision makers, etc.).

Basically, those are the actors that your business interacts with (or should interact with). You are not an island.

How do you use network pictures for your startup?

Now, the important thing is this: you first draw the current situation, and then the vision. I repeat,

1. Draw current situation

2. Draw vision

3. Compare the two

The point is that drawing the vision makes your desired state apparent, which makes it easier to create a tangible plan for networking. It’s about making the vision explicit.

It also helps you consider possibilities that you had overlooked. Like, “Oh, we should check if the local university has any research projects that coincide with our product development roadmap“. Or, “We could meet up with the industry association people to ask if they find potential in our tech.” Things like that.

Through this process, you (hopefully) realize that you’re not an island, and that there are many parties you could (and should) involve in your business at varying degrees of commitment. You can continue by analyzing the motives and win-wins that your connections to current and future parties entail. A good approach has been illustrated in [2]:

Conclusion

Often, business planning for startups focuses on competitors, but collaborators can be even more important. Start by drawing them, and then make the connections happen.

If you’re a startup founder and haven’t thought about the importance of networks, you should. There is research that shows networks and connections matter — and common sense supports this argument, too. You are acting in an ecosystem of other players. It’s the market, not your garage, that matters.

Footnotes:

[1] Kaartemo, V. (2013). Network development process of international new ventures in internet-enabled markets: service ecosystems approach (Doctoral dissertation). Turku School of Economics, Turku, Finland. Retrieved from http://tsenet.fi/wp-content/uploads/2013/11/network-development-process-of-international-new-ventures.pdf

[2] Valjakka T., Kaartemo V., Valkokari K. (2017) Making Sense of Network Dynamics through Network Picturing. In: Vesalainen J., Valkokari K., Hellström M. (eds) Practices for Network Management. Palgrave Macmillan, Cham

Use Cases for Personas

This is a joint piece by Dr. Joni Salminen and Professor Jim Jansen. The authors are working on a system for automatic persona generation at the Qatar Computing Research Institute. The system is available online at https://persona.qcri.org.

Introduction

Personas are fictive characterizations of the core audience or customers of a company, introduced into software development and marketing in the 1990s (see Jenkinson, 1994; Cooper, 1999). Personas capture and summarize key elements of key customer segments so that decision makers can better understand their audience or customers, not just through numbers but also through qualitative attributes, such as key pain points, desires, needs, and wants. We refer to persona creation as “giving faces to data,” as personas are ideally based on real data on customer behavior. Figure 1 shows an example of a data-driven persona in which the attributes are inferred automatically from social media data.

Figure 1: Data-driven persona.

While personas have been argued to have many benefits in the academic literature (see e.g., Nielsen, 2004; Pruitt & Grudin, 2003; Salminen et al., 2017), we are constantly facing the same questions from new client organizations wishing to use our system for automatic persona generation (APG) (An et al., 2017). Namely, they want to know how to use personas in practice. While we often make the analogy that personas are like any other analytics system, meaning that the use cases depend on the client’s information needs (i.e., what they want to know about the customers), this answer is still a bit puzzling to them.

For that reason, we decided to write this piece outlining some key use cases for personas. These are meant as examples, as the full range of use cases is much wider. We will first explore some general use cases, and then proceed to elaborate on more specific persona use cases by different organizational units.

General Use Cases of Personas

In general, there are three main purposes personas serve:

1) Customer Insights. This deals with getting to know your core audience, users or customers better. For example, APG enables an organization to understand its audience’s topics of interest and preferred social media content. Who uses? Everyone in the organization.

2) Creation Activities. Using persona information to create better products, content, marketing communication, or other outputs. Who uses? Everyone in the organization dealing with customer-facing outputs.

3) Communication. Using personas for communication across departments. While it is difficult to discuss a spreadsheet, it is much easier to communicate about a person. Sharing the persona work across divisions thus increases the chance for realization of benefits. Personas make data communicable and keep team members focused on the customer needs. Who uses? Everyone in the organization.

Specific Use Cases of Personas

In addition to shared use cases of personas, there are more specific use cases. For example, product managers can use the information to design a product that meets the needs or desires of core customers, and marketing can use personas to craft messages that resonate. Here, we are outlining specific examples of use cases within organizational units. More specifically, we allocate these use cases under four sections.

1) Customer Insights and Reporting

Journey Mapping: Plot the stages and paths of the persona lifecycle, documenting each persona’s unique state of mind, needs and concerns at each stage. Understand your website visitors’ customer journey.

Persona Discovery: Document the individuals involved in the purchase process in a way that allows decision makers to empathize with them in a consistent way.

Brand Discovery: Uncover how your core customers feel about your product or service and how they rationalize their purchase decisions.

Reporting and Feedback: Report and review data and insights to drive strategic decisions, as well as provide information to the organization as a whole.

2) Creation Activities

Planning Product Offerings: With the help of personas, organizations can more easily build the features that suit their customers’ needs. Consider the goals, desires, and limitations of core customers to guide feature, interface, and design choices.

Role Playing: Personas help product developers “get into character” and understand the circumstances of their users. They facilitate genuine understanding of the thoughts, feelings, and behaviors of core customers. Individuals have a natural tendency to relate to other humans, and it’s important to tap into this trait when making design and product development choices.

Content Creation: Content creators can leverage personas for delivery of content that will be most relevant and useful to their audience. When planning for content, we might ask “Would Jamal understand this?” or “Would Jamal be attracted by this?” Personas help one determine what kind of content is needed to resonate with core customers and in which tone or style to deliver the content. Naturally, customer analytics can and should be used to verify the results.

3) Persona Experimentation

Channel and Offering Alignment: Align every piece of offerings and marketing activity to a persona and purchase stage, identifying new channels and needs where opportunities exist.

Prediction of Popularity: Predict how a given persona will react to content, marketing messages, or products. This is a particular advantage of data-driven personas that enable using the underlying topical interests of the persona to model the likely match between personas and a given content unit.

Experimentation and Optimization: Carry out well-designed experiments with personas to produce statistically valid business insights and apply the results to optimize performance. For example, you could run Facebook Ads campaigns targeting segments corresponding to the core personas and analyze whether the campaigns perform better than those targeting broader or other customer segments.
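One way to check whether a persona-targeted campaign really outperforms a broader one is a two-proportion z-test on conversion rates. The sketch below is only an illustration: the function name and the campaign numbers are made up, and it assumes large, independent samples.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both campaigns convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Standard normal CDF expressed through the error function.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value


# Made-up numbers: persona-targeted campaign (a) vs. broad campaign (b).
z, p = two_proportion_z_test(conv_a=120, n_a=2000, conv_b=90, n_b=2000)
print(round(z, 2), round(p, 4))
```

A small p-value suggests the difference is unlikely to be chance alone, but the usual caveats about campaign experiments (budget, timing, audience overlap) still apply.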

4) Strategic Decision Making

Strategic Marketing: When you understand where your core customers spend their time online, you are able to focus your marketing spend on these channels. For example, if the data shows that your core customers prefer YouTube over Facebook, you can increase your marketing spend in the former. Think how you might describe your product for this particular type of person. For example, would Bridget better understand your offering as a “social media service” or as an “enterprise customer management tool”? Depending on the answer, the communicative strategy would be different.

Sales Strategies: Targeted offerings can help organizations convert more potential customers to subscribers, followers and customers. You can also use personas to tailor lead generation strategies which is likely to improve your lead quality and performance. By approaching your messages from a human perspective, you can create sales and marketing communication that is tailored to your core customers and, therefore, is likely to perform better.

Executives: Key decision makers can keep personas in mind while making strategic decisions. In fact, a persona can become a “silent member in the boardroom,” evoked to question the customer impact of the considered decisions.

Examples for the APG system

In the following, we include some use case examples from the APG system, which generates personas automatically from online analytics and social media data. The system is currently fully functional, and we are accepting a limited number of new clients with free-of-charge research licenses. See the end of this post for more details.

Figure 2: This functionality enables the client to generate personas from their chosen data source (currently, the following platforms are supported: YouTube, Facebook, Google Analytics). The client can choose between 5 and 15 personas.

Figure 3: The persona profile shows detailed information about the persona. It enables human-oriented customer insights.

Figure 4: This feature enables an easy comparison of the personas across their key attributes. It improves understanding of the core customer segments.

Figure 5: This feature shows which personas most often react to which individual pieces of content.

Figure 6: This feature shows how the interests and other information of the personas change over time. Currently, APG generates new personas on a monthly basis.

Figure 7: This feature enables a gap analysis of the current audience and potential audience. The statistics are retrieved from actual audience data of the organization and the corresponding Facebook audience (via Facebook Marketing API).

Conclusion

Forrester Research (2010) reports a 20% productivity improvement with teams that use personas. Yet, using personas is not always straightforward. Ultimately, the exact use cases depend on the client’s information needs. These needs can best be found by collaborating with persona creators to provide tailored personas that are useful specifically for a given organization in their practical decision making.

Through means of “co-creation,” clients and persona creators can figure out together how the personas could be useful for real usage scenarios. According to our experience, useful questions for defining the client’s information needs include:

  • What are your objectives for content creation / marketing?
  • What kind of customer-related decisions do you make?
  • What kind of customer information do you need?
  • What analytics information are you currently using?
  • What kind of customer-related questions do you currently not get good answers to?
  • How would you use personas in your own work?
  • What information do you find useful in the persona mockup?
  • What information is missing from the persona mockup?

If you are interested in the possibilities of automatic persona generation for your organization, don’t hesitate to contact us! Professor Jim Jansen will gladly provide more information: [email protected]. However, please note that for automatic persona generation to be useful for your organization, you need to have at least hundreds (preferably thousands) of content pieces published online with a wide audience viewing them. APG is great at summarizing complex audiences, but if you don’t have enough data, persona generation is better done via manual methods.

References

An, J., Haewoon, K., & Jansen, B. J. (2017). Personas for Content Creators via Decomposed Aggregate Audience Statistics. In Proceedings of Advances in Social Network Analysis and Mining (ASONAM 2017). http://www.bernardjjansen.com/uploads/2/4/1/8/24188166/jansen_personas_asonam2017.pdf

Cooper, A. (1999). The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity (1 edition). Indianapolis, IN: Sams – Pearson Education.

Forrester Research. (2010). The ROI Of Personas. Retrieved from https://www.forrester.com/report/The+ROI+Of+Personas/-/E-RES55359

Jenkinson, A. (1994). Beyond segmentation. Journal of Targeting, Measurement and Analysis for Marketing, 3(1), 60–72.

Nielsen, L. (2004). Engaging personas and narrative scenarios (Vol. 17). Samfundslitteratur. Retrieved from http://personas.dk/wp-content/samlet-udgave-til-load.pdf

Pruitt, J., & Grudin, J. (2003). Personas: Practice and Theory. In Proceedings of the 2003 Conference on Designing for User Experiences (pp. 1–15). New York, NY, USA: ACM.

Salminen, J., Şengün, S., Haewoon, K., Jansen, B. J., An, J., Jung, S., Vieweg, S., and Harrell, F. (2017). Generating Cultural Personas from Social Data: A Perspective of Middle Eastern Users. In Proceedings of The Fourth International Symposium on Social Networks Analysis, Management and Security (SNAMS-2017). Prague, Czech Republic. Available at http://www.bernardjjansen.com/uploads/2/4/1/8/24188166/jansen_mena_personas2017.pdf

Tips on Data Imputation From Machine Learning Experts

Missing values are a critical issue in statistics and machine learning (which is “advanced statistics”). Data imputation deals with ways to fill those missing values.

Andriy Burkov made this statement a few days ago [1]:

“The best way to fill a missing value of an attribute is to build a classifier (if the attribute is binary) or a regressor (if the attribute is real-valued) using other attributes as “X” and the attribute you want to fix as “y”.”
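To make the suggestion concrete, here is a minimal sketch for the real-valued case. It assumes a single numeric predictor and an ordinary least-squares line; the function name and the toy data are hypothetical, and a real project would more likely use a library model with several predictors.

```python
def impute_with_regression(xs, ys):
    """Return a copy of ys where None entries are replaced by predictions from x."""
    # Fit only on rows where the target attribute is observed.
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx  # assumes x is not constant among the complete rows
    intercept = mean_y - slope * mean_x
    return [y if y is not None else intercept + slope * x
            for x, y in zip(xs, ys)]


# Toy data: "age" plays the role of X, "income" is the attribute to fix (y).
ages = [20, 30, 40, 50, 60]
incomes = [2000.0, 3000.0, None, 5000.0, 6000.0]
imputed = impute_with_regression(ages, incomes)
print(imputed)
```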

However, the issue is not that simple. As noted by one participant:

From Franco Costa, Developer: Java, Cloud, Machine Learning:

What if is totally independent from the other features? Nothing to learn

The discussion then quickly expanded and many machine learning experts offered their own experiences and tips for solving this problem. At the time of writing (March 8, 2018), there are 69 answers.

Here are, in my opinion, the most useful ones.

1) REMOVE MISSING VALUES

From Blaine Bateman, EAF LLC, Founder and Chief Data Engineer at EAF LLC:

Or just drop it from the predictors

From Swapnil Gaikwad, Software Engineer Cognitive Computing (Chatbots) at Light Information Systems Pvt. Ltd.:

Also I got an advice from one of my mentor is whenever we have more than 50% of the missing values in a column, we can simply omit that column (if we can), if we have enough other features to build a model.
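The 50% rule of thumb above can be sketched in a few lines. This is an illustration only: rows are plain dictionaries with None marking a missing value, and the function name and threshold parameter are my own assumptions.

```python
def drop_sparse_columns(rows, max_missing_share=0.5):
    """Remove columns whose share of None values exceeds max_missing_share."""
    columns = list(rows[0].keys())
    n = len(rows)
    keep = [c for c in columns
            if sum(r[c] is None for r in rows) / n <= max_missing_share]
    return [{c: r[c] for c in keep} for r in rows]


rows = [
    {"age": 34, "income": None, "city": "Turku"},
    {"age": 51, "income": None, "city": "Doha"},
    {"age": None, "income": 4200, "city": "Turku"},
    {"age": 29, "income": None, "city": None},
]
cleaned = drop_sparse_columns(rows)
print(list(cleaned[0].keys()))  # "income" is 75% missing, so it is dropped
```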

2) ASK WHY

From Kevin Gray, Reality Science:

It’s of fundamental importance to do our best to understand why missing data are missing. Two excellent sources for an in-depth look at this topic are Applied Missing Data Analysis (Enders) and Handbook of Missing Data Methodology (Molenberghs et al.). Outlier Analysis (Aggarwal) is also relevant. FIML and MI are very commonly used by statisticians, among other approaches.

From Julio Bonis Sanz, Medical Doctor + MBA + Epidemiologist + Software Developer = Health Data Scientist:

In some analysis I have done in the past, including “missing” as a value for prediction itself have got some interesting results. The fact that for a given observation that value is missing is sometimes associated with the outcome you want to predict.

From Tero Keski-Valkama, A Hacker and a Machine Learning Generalist:

Also, you can try to check if the value being missing encodes some real phenomenon (like the responder chooses to skip the question about gender, or a machine dropping temperature values above a certain threshold) by trying to train a classifier to predict whether a value would be missing or not. It’s not always the case that values being missing are just independent random noise.

From Vishnu Sai, Decision Scientist at Mu Sigma Inc.:

In my experience, I’ve found that the technique for filling up missing values depends on the business scenario.

From David T. Kearns, Co-founder, Sustainable Data and Principal Consultant, Sustainable Services:

I think it’s important to understand the underlying cause of the missing values. If your data was gathered by survey, for example, some people will realise their views are socially unpopular and will keep them to themselves. You can’t just average out that bias – you need to take steps to reduce it during measurement. For example, design your survey process to eliminate social pressure on the respondent.

For non-human measurements, sometimes instruments can be biased or faulty. We need to understand if those biases/faults are themselves a function of the underlying measurements – do we lose data just as our values become high or low for example? This is where domain knowledge is useful – making intelligence decisions of what to do, not blind assumptions.

If you’ve done all that and still have some missing values, then you’ll be in a far stronger position to answer your question intelligently.
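A quick first check on whether missingness carries signal is to compare another variable between rows where the value is missing and rows where it is present. This is a much simpler stand-in for training a classifier to predict missingness; the sensor data and field names below are invented for illustration.

```python
from statistics import mean


def compare_by_missingness(rows, target, other):
    """Mean of `other` for rows where `target` is missing vs. present."""
    missing = [r[other] for r in rows if r[target] is None]
    present = [r[other] for r in rows if r[target] is not None]
    return mean(missing), mean(present)


# Hypothetical sensor log: temperature readings drop out when load is high.
readings = [
    {"load": 10, "temperature": 21.0},
    {"load": 20, "temperature": 24.5},
    {"load": 80, "temperature": None},
    {"load": 30, "temperature": 26.0},
    {"load": 90, "temperature": None},
]
load_when_missing, load_when_present = compare_by_missingness(
    readings, "temperature", "load")
print(load_when_missing, load_when_present)
```

A large gap between the two means hints that the values are not missing at random, which is a reason to look into the cause before imputing.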

3) USE MISSING VALUES AS A FEATURE

From Julio Bonis Sanz, Medical Doctor + MBA + Epidemiologist + Software Developer = Health Data Scientist:

One of my cases was a predictive model of use of antibiotics by patients with chronic bronchitis. One of the variables was smoking with about 20% of missing values. It turned out that having no information in the clinical record about smoking status was itself a strong predictor of use of antibiotics because a patient missing this data were receiving worse healthcare in general. By using imputation methods you someway lose that information.

From Kirstin Juhl, Full Stack Software Developer/Manager at UnitedHealth Group:

Julio Bonis Sanz Interesting- something that I wouldn’t have thought of – missing values as a feature itself.

From Peter Fennell, Postdoctoral Researcher in Statistics and A.I. @ USC:

Thats fine if have one attribute with missing values. Or two. But what if many of your features have missing values? Do recursive filling, but that can lead to error propagation? like to think that there is value in missing value, and so giving them their own distinct label (which, eg, a tree based classifier can isolate) can be an effective option
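The idea of treating missingness as a feature can be sketched as follows: add a 0/1 flag column and give missing entries their own label. The function name, the "unknown" label, and the patient records are assumptions for illustration.

```python
def add_missing_indicator(rows, column, fill_value="unknown"):
    """Add a 0/1 '<column>_missing' flag and label missing entries explicitly."""
    out = []
    for r in rows:
        new = dict(r)  # copy so the input rows are left untouched
        is_missing = r[column] is None
        new[f"{column}_missing"] = int(is_missing)
        if is_missing:
            new[column] = fill_value
        out.append(new)
    return out


patients = [
    {"id": 1, "smoking": "yes"},
    {"id": 2, "smoking": None},
    {"id": 3, "smoking": "no"},
]
flagged = add_missing_indicator(patients, "smoking")
print(flagged[1])
```

A tree-based classifier can then split on either the flag or the distinct label.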

4) USE TESTED PACKAGES SUCH AS MICE OR RANDOM FOREST

From Jehan Gonsal, Senior Insights Analyst at AIMIA:

MCMC methods seem like the best way to go. I’ve used the MICE package before and found it to be very easy to audit and theoretically defensible.

From Swapnil Gaikwad, Software Engineer Cognitive Computing (Chatbots) at Light Information Systems Pvt. Ltd.:

This is a great advice! In one of my projects, I have used the R package called MICE which does the regression to find out the missing values. It works much better than the mean method.

From Nihit Save, Data Analyst at CMS Computers Limited (INDIA):

Multivariate Imputation using Chained Equation (MICE) is an excellent algorithm which tries to achieve the same. https://www.r-bloggers.com/imputing-missing-data-with-r-mice-package/

From ROHIT MAHAJAN, Research Scholar – Data Science and Machine Learning at Aegis School of Data Science:

In R there are many packages like MICE, Amelia and most Important “missForest” which will do this for you. But it takes too much time if data is more than 500 Mb. I always follow this regressor/classifier approach for most important attributes.

From Knut Jägersberg, Data Analyst:

Another way to deal with missing values in a model based manner is by using random forests, which work for both categorical and continuous variables: https://github.com/stekhoven/missForest . This algorithm can be easily be reimplemented with i.e. a faster than in R implemented RF algorithm such as ranger (https://github.com/imbs-hl/ranger) and then scales well to larger datasets.

5) USE INTERPOLATION

From Sekhar Maddula, Actively looking for Data-science roles:

Partly agree Andriy Burkov. But at the same time there are few methods specific to the technique/algo. e.g. For Time-series data, you may think of considering interpolation methods available with the R package “imputeTS”. I also hope that there are many Interpolation methods in the field of Mathematics. We may need to try an appropriate one.
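For time series, the simplest interpolation method is a straight line between the nearest known neighbours. The sketch below is plain Python and is not the imputeTS API; the function name and the visit counts are hypothetical.

```python
def interpolate_linear(series):
    """Fill interior None gaps by straight-line interpolation between neighbours."""
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled  # leading or trailing gaps are left as None


daily_visits = [100.0, None, None, 130.0, 128.0, None, 120.0]
interpolated = interpolate_linear(daily_visits)
print(interpolated)
```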

6) ANALYZE DISTRIBUTIONS

From Gurtej Khanooja, Software Engineering/Data Science Co-op at Weather Source| Striving for Excellence:

One more way of dealing with the missing values is to identify the distribution using remaining values and fill the missing values by randomly filling the values from the distribution. Works fine in a lot of cases.

From Tero Keski-Valkama, A Hacker and a Machine Learning Generalist:

If you are going to use a classifier or a regressor to fill the missing values, you should sample from the predicted distribution rather than just picking the value with the largest probability.
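Sampling from the distribution of the remaining values can be sketched as drawing at random from the observed values themselves (the empirical distribution). The function name and the seed parameter are assumptions; the seed is there only to make the example reproducible.

```python
import random


def impute_by_sampling(values, seed=0):
    """Replace None entries with random draws from the observed values."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    return [v if v is not None else rng.choice(observed) for v in values]


ratings = [3, 5, None, 4, None, 5, 2]
sampled = impute_by_sampling(ratings)
print(sampled)
```

Unlike filling with the mean or the most likely value, this keeps the spread of the variable roughly intact.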

SUMMARY

This was the best summary comment I found:

“I have used MICE package in R to deal with imputing and luckily it produced better results. But in general we should take care of the following:

  1. Why the data is missing? Is is for some meaningful reason or what?
  2. How much data is missing?
  3. Fit a model with non missing values.
  4. Now apply the imputing technique and fit the model and compare with the earlier one.”

-Dr. Bharathula Bhargavarama Sarma, PhD in Statistics, Enthusiastic data science practitioner

FOOTNOTES

[1] https://www.linkedin.com/feed/update/urn:li:activity:6375893686550614016/

Creating Buyer Personas: Common Interview Questions

Introduction

At Qatar Computing Research Institute (QCRI), we are developing a system for automatic persona generation (APG). The demo is available online at https://persona.qcri.org

As a part of this research, we’re interested in the information needs of end users of personas [1]. People working in different domains are interested in different information, after all. For example, journalists want to know what type of news the personas are consuming, while e-commerce marketers want to know what products they are buying.

We have reviewed a lot of material on interviewing customers to create persona profiles. Although our approach is based on automation and computational techniques, we are interested in experimenting with mixed personas that use qualitative data to enrich the automatically generated personas [2].

This brief post shares some of the key insights we’ve found.

Persona Information

In general, when creating personas we need to query two types of information:

  1. Information needs of persona users => this means what information people inside our organization want to know
  2. Customer information => this means what information we can learn about the customers

For the former, we have developed an Information Needs Questionnaire with eight questions:

  1. What are your objectives for content creation / marketing?
  2. What kind of customer-related decisions do you make?
  3. What kind of customer information do you need?
  4. What analytics information are you currently using?
  5. What kind of customer-related questions do you currently not get good answers to?
  6. How would you use personas in your own work?
  7. What information do you find useful in the persona mockup?
  8. What information is missing from the mockup?

The purpose of these questions is to discover the interviewee’s professional information needs. This is useful for developing analytics systems, e.g. automatic persona generation, but also extends to traditional persona creation.

In the following, we summarize some questions intended for customers.

From Mr. Steve Cartwright (2015) [3]:

“I know that when I am preparing buyer personas I have a whole heap of questions that I ask in fact I have a PowerPoint I go through with clients, this enables me to generate the personas that I need. However, if you start by simply asking:

·      Who are they?

·      What do these people do?

·      Are they married, singles, living with a partner?

·      What problems or concerns do they have, that your industry niche can solve?

·      Where do they hang out and what do they do online?

·      Are these people decision makers, influencers or referral sources?

Just those six questions are all you need to get started and to start to understand who you’re customers are and to turn your business into a customer centric one.”

***

From “Nisha” (2013) [4]:

“Questions for B2B marketers to delve into while creating buyer personas include:

  • Buyer experience and reporting officer of the prospect
  • Professional background of the prospect
  • Kind of organization
  • Organizations’ segment focus
  • History of purchases
  • Change in role in past few years
  • Market forces influencing buyers
  • Most urgent problems
  • What funded initiatives does the buyer have
  • What are the motivations that drive the buyer
  • What the buyer’s needs?
  • What is the budget?
  • Who are involved in the decision-making?
  • Attitude of the company towards the product/service”

***

From Jesse Ness [5] (2016):

“Demographic questions:

These are the most basic questions that you should be asking your target customers, such as:

·      Are they married?

·      How old are they?

·      Where do they live?

·      Do they have children? How many? What ages?

·      Which country/city did they grow up in?

Education questions:

Our early school and college education help us shape as adults. People usually tend to answer these questions more honestly.

·      What level of education did they complete?

·      Which schools did they attend? Public or Private?

·      What did they study?

·      Were they popular at school?

·      Which extra-curricular activities (if any) did they take part in?

Career questions:

Questions about the working life of your prospects reveals a lot of interesting details about them.

·      What industry do they work in?

·      What is their current job level?

·      What was their first full-time job?

·      How did they end up where they are today?

·      Has their career track been traditional or did they switch from another industry?

Financial questions:

Your customers finances will tell you what they can afford and how easily they make their purchasing decisions.

·      How often you buy high ticket items?

·      How much are they worth?

·      Are they responsible for making purchasing decision in the household?

Keep in mind that people tend to answer financial questions incorrectly, even in anonymous online surveys. Some might even construe this as an invasion of their privacy. Temper your results accordingly (usually by decreasing the stated average income).”

Conclusion

There are myriad questions one can ask customers when creating persona profiles. However, they should be based on first defining internal information needs. In the persona creation process, the above question lists serve as inspiration.

Interested in automatic persona generation for your company? Contact Dr. Jim Jansen: [email protected]

Footnotes

[1] Personas are fictive characters based on real data about the underlying audience. Their purpose is to make customer analytics more easily understandable than numbers and graphs.

[2] Salminen, J., Şengün, S., Haewoon, K., Jansen, B. J., An, J., Jung, S., … Harrell, F. (2017). Generating Cultural Personas from Social Data: A Perspective of Middle Eastern Users. In Proceedings of The Fourth International Symposium on Social Networks Analysis, Management and Security (SNAMS-2017). Prague, Czech Republic.

[3] https://website-designs.com/online-marketing/content-marketing/buyers-personas-allow-you-to/

[4] https://www.xerago.com/blog/2013/08/why-buyer-personas-are-not-the-same-as-customer-profiling/

[5] https://www.ecwid.com/blog/how-to-create-buyer-personas-for-an-ecommerce-store.html

Google Analytics: 21 Questions to Get Started

I was teaching a course called “Web & Mobile Analytics” at Aalto University back in 2015.

As a part of that course, the students conducted an analytics audit for their chosen website. I’m sharing the list of questions I made for that audit, as it’s a useful list for getting to know Google Analytics.

The questions

Choose a period to look at (e.g., the past month, three months, last year, this year… generally, the longer the better because it gives you more data). Answer the questions. The questions are organized by sections of Google Analytics.

a. Audience

  • How has the website traffic evolved during the period you’re inspecting? How does the traffic from that period compare to earlier periods?
  • What are the 10 most popular landing pages?
  • What are the 10 pages with the highest bounce rate AND at least 100 visits in the last month? (Hint: advanced filter)

b. Behavior

  • How does the user behavior differ based on the device they’re using? (Desktop/laptop, mobile, tablet)
  • Where do people most often navigate to from the homepage?
  • How do new and returning visitors differ in behavior?
  • What is the general bounce rate of the website? Which channel has the highest bounce rate?
  • How well do the users engage with the website? (Hint: Define the metrics you used to evaluate engagement.)
  • Is there a difference in engagement between men and women?

c. Acquisition

  • How is the traffic distributed among the major sources?
  • Can you find performance differences between paid and organic channels?
  • Compare the goal conversion rate of different marketing channels to the site average. What can you discover?

d. Conversion

  • What is the most profitable source of traffic?
  • What is the best sales (or conversion, based on the number of conversions) month of the year? How would you use this information in marketing planning?
  • Which channels or sources seem most promising in terms of sales potential? (Hint: look at the channels with high CVR and low traffic)
  • Analyze conversion peaks. Are there peaks? Can you find an explanation for such peaks?
  • Can you find sources that generated assisted conversions? Which sources are they? Is the overall volume of assisted conversions significant?
  • Does applying another attribution model besides the last click model alter your view on the performance of marketing channels? If so, how?

e. Recommendations

  • Based on your audit, how could the case company develop its digital marketing?
  • How could the case company’s use of analytics be developed? (E.g., what data is not available?)
  • What other interesting discoveries can you make? (Mention 2–5 interesting points.)

Answering the above questions provides a basic understanding of a goal-oriented website’s situation. In the domain of analytics, asking the right questions is often the most important (and difficult) thing.

The dashboard

In addition, the students built a dashboard for the class. Again, the instructions illustrate some useful basic functions of Google Analytics.

Build a dashboard showing the following information. Include a screenshot of your dashboard in the audit report.

Where is the traffic coming from?

  • breakdown of traffic by channel

What are the major referral sources?

  • 10 biggest referral sites

How are conversions distributed geographically?

  • 5 biggest cities by conversions

How is Facebook bringing conversions?

  • Product revenue from Facebook as a function of time

Are new visitors from different channels men or women?

  • % new sessions by channels and gender

What keywords bring in the most visitors and money?

  • revenue and sessions by keyword

If you see fit, include other widgets in the dashboard based on the key performance metrics of your company.

Conclusion

Reports and dashboards are basic functions of Google Analytics. More advanced uses include custom reports and metrics, alerts, and data importing.

Simple methods for anomaly detection in e-commerce

An anomaly is a deviation from the expected value. The main challenges are: (a) how large the deviation must be to be classified as an anomaly, and (b) what time frame or subset of data we should examine.

The simplest way to answer those questions is to use your marketer’s intuition. As an e-commerce manager, you have an idea of how big an impact constitutes an anomaly for your business. For example, if sales change by 5% in a daily year-on-year comparison, that would not typically be an anomaly in e-commerce, because purchase patterns naturally deviate this much or even more. However, if your business is growing much faster and you suddenly drop from 20% y-o-y growth to 5%, then you could consider such a shift an anomaly.

So, the first step should be to define which metrics are most central for tracking. Then, you would define threshold values and the time period. In e-commerce, we could e.g. define the following metrics and values:

  • Bounce Rate – 50% Increase
  • Branded (Non-Paid) Search Visits – 25% Decrease
  • CPC Bounce – 50% Increase
  • CPC Visit – 33% Decrease
  • Direct Visits – 25% Decrease
  • Direct Visits – 25% Increase
  • Ecommerce Revenue – 25% Decrease
  • Ecommerce Transactions – 33% Decrease
  • Internal Search – 33% Decrease
  • Internal Search – 50% Increase
  • Non-Branded (Non-Paid) Search Visits – 25% Decrease
  • Non-Paid Bounces – 50% Increase
  • Non-Paid Visits – 25% Decrease
  • Pageviews – 25% Decrease
  • Referral Visits – 25% Decrease
  • Visits – 33% Decrease

As you can see, this is rule-based detection of anomalies: once the observed change exceeds the threshold value in a given time period (say, daily or weekly tracking), the system alerts the e-commerce manager.
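The rule-based approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the metric keys, the `RULES` structure, and the sample numbers are hypothetical, and only three of the rules listed above are included.

```python
# Hypothetical thresholds taken from a few of the rules listed above.
# Each rule: metric name -> (direction, relative change that triggers an alert).
RULES = {
    "bounce_rate": ("increase", 0.50),
    "ecommerce_revenue": ("decrease", 0.25),
    "visits": ("decrease", 0.33),
}

def relative_change(baseline, observed):
    """Relative change from baseline, e.g. 0.5 means +50%."""
    return (observed - baseline) / baseline

def check_rules(baseline, observed, rules=RULES):
    """Return a list of (metric, change) pairs that break their rule."""
    alerts = []
    for metric, (direction, threshold) in rules.items():
        change = relative_change(baseline[metric], observed[metric])
        if direction == "increase" and change >= threshold:
            alerts.append((metric, change))
        elif direction == "decrease" and change <= -threshold:
            alerts.append((metric, change))
    return alerts

# Made-up numbers: previous period (baseline) vs. current period (observed).
baseline = {"bounce_rate": 0.40, "ecommerce_revenue": 10000.0, "visits": 5000}
observed = {"bounce_rate": 0.45, "ecommerce_revenue": 7000.0, "visits": 4800}

for metric, change in check_rules(baseline, observed):
    print(f"ALERT: {metric} changed by {change:+.0%}")
```

With these sample numbers only the revenue rule fires, because revenue fell by 30% while the other two metrics stayed within their thresholds.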

The difficulty, of course, lies in defining the threshold values. Due to changing baseline values, they need to be constantly updated. Thus, there should be better ways to detect anomalies.

Another simple method is a sliding window algorithm. This algorithm can (a) update the baseline value automatically based on data, and (b) identify anomalies based on a statistical property rather than the marketer’s intuition. The parameters for such an algorithm are:

  • frequency: how often the algorithm runs, e.g. daily, weekly, or monthly. Even intra-day runs are possible, but in most e-commerce cases not necessary (an exception could be technical metrics such as server response time).
  • window size: the period used for computing the baseline. For example, if the window size is 7 days and the algorithm runs daily, it always uses data from the past seven days, shifting the start and end dates forward by one day at each run.
  • statistical threshold: the logic for detecting anomalies. A typical approach is to (a) compute the mean and standard deviation of each metric over the window, and (b) compare new values to the mean, so that a difference of more than 2 or 3 standard deviations from the mean indicates an anomaly.

Thus, the threshold values automatically adjust to the moving baseline because the mean is recalculated every time the window moves.
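A minimal sketch of such a sliding window check, using only the Python standard library. The function name, the default parameters, and the sample revenue series are assumptions made for illustration.

```python
from statistics import mean, stdev

def sliding_window_anomalies(values, window=7, n_sigmas=3.0):
    """Flag values that deviate more than n_sigmas standard deviations
    from the mean of the preceding `window` observations.

    Returns a list of (index, value, window_mean) tuples.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]      # the window slides forward by one each step
        mu = mean(history)
        sigma = stdev(history)              # sample standard deviation
        if sigma == 0:
            continue                        # flat history: no spread to compare against
        if abs(values[i] - mu) > n_sigmas * sigma:
            anomalies.append((i, values[i], mu))
    return anomalies

# Made-up daily revenue figures; the value at index 10 is an obvious drop.
daily_revenue = [100, 104, 98, 101, 97, 103, 99, 102, 100, 98, 60, 101, 99]

for day, value, window_mean in sliding_window_anomalies(daily_revenue):
    print(f"day {day}: {value} vs. window mean {window_mean:.1f}")
```

One design caveat: in this sketch a flagged value still enters the following windows, which inflates the standard deviation for a while and can hide further anomalies. Excluding flagged values from the history is one possible refinement.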

How to interpret anomalies?

Note that an anomaly is not necessarily a bad thing. Positive anomalies occur e.g. when a new campaign kicks off, or the company achieves some form of viral marketing. Anomalies can also arise when a new season begins. To keep such seasonal effects from showing up as anomalies, one can configure the baseline to represent year-on-year data instead of historical data from the current year. Regardless of whether the direction of the change is positive or negative, it is useful for a marketer to know there is a change of momentum. This helps restructure campaigns, allocate resources properly, and become aware of the external effects on key performance indicators.
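The year-on-year baseline could look roughly like the sketch below. It compares each period with the same period of the previous year and flags changes in the growth rate itself. The function names, the `max_shift` parameter, and the weekly figures are hypothetical.

```python
def yoy_growth(this_year, last_year):
    """Year-on-year growth per period, e.g. 0.2 means +20% vs. the same period last year."""
    return [(cur - prev) / prev for cur, prev in zip(this_year, last_year)]

def growth_shift_alerts(growth, max_shift=0.10):
    """Flag periods where y-o-y growth moved more than max_shift
    (as an absolute difference in growth rate) from the previous period."""
    return [
        (i, growth[i - 1], growth[i])
        for i in range(1, len(growth))
        if abs(growth[i] - growth[i - 1]) > max_shift
    ]

# Made-up weekly sales with a seasonal peak at week index 2 in both years.
last_year = [1000, 1100, 2000, 1200, 1000]
this_year = [1200, 1320, 2400, 1260, 1200]

growth = yoy_growth(this_year, last_year)
for week, before, after in growth_shift_alerts(growth):
    print(f"week {week}: y-o-y growth moved from {before:+.0%} to {after:+.0%}")
```

In this sample the seasonal peak does not trigger an alert because it occurs in both years, whereas the drop from 20% to 5% growth and the later recovery are both flagged as changes of momentum.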

Platform metrics: Some ideas

I was chatting with Lauri [1] about platform research. I claimed that the research does not have that many implications for real-world companies apart from the basic constructs of network effects, two-sidedness, tipping, and marquee users; strategies such as envelopment; and of course many challenges, including chicken-and-egg problems, the monetization dilemma, and remora’s curse (see my dissertation on startup dilemmas for more insights on those…).

But then I got to thinking that metrics are kind of overlooked. Most of the platform literature comes from economics and is very theoretical and math-oriented. Yet it is somehow detached from practical Web analytics and digital metrics. Those kinds of metrics, however, are very important for platform entrepreneurs and startup founders.

On the face of it, it seems that the only difference between platforms and “normal” businesses is two-sidedness. If we have a supply side (Side A) and a demand side (Side B), then the metrics could be identical for each, and the main thing is to keep track of the metrics for both sides.

However, there are some dynamics at play. The company has one goal, one budget, one strategy, at least typically. That means those metrics, even though they can be computed separately, are interconnected.

Here are some examples of platform metrics:

  • Number of Users/Customers (Side A, Side B)
  • Revenue (Side A, Side B)
  • Growth of Revenue (Side A, Side B)
  • Cross-correlation of Number of Users and Revenue (e.g., Side A users => Side B revenue)
  • Cost per User/Customer Acquisition (Side A, Side B)
  • Support cost (Side A, Side B)
  • Average User/Customer Lifetime (Side A, Side B)
  • Average Transaction Value (Side A, Side B)
  • Engagement Volume (Side A, Side B)
  • Profitability distribution (Side A, Side B)

Note the cross-correlation example. Basically, all the metrics can be cross-correlated to analyze how different metrics of each side affect each other. Moreover, this can be done over different time periods to increase the robustness of the findings. Such correlations can reveal important information about the dynamics of network effects and tell, for example, whether to focus on adding Side A or Side B at a given point in time. A typical example is solving the cold start problem by hitting critical mass, i.e., the minimum number of users required for network effects to take place (essentially, the number of users needed for the platform to be useful). Before this point is reached, all other metrics can look grim; however, after reaching that point, the line charts of other metrics should turn from a flat line to linear or exponential growth, and the platform should ideally become self-sustaining.
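As a rough illustration of the cross-correlation idea, the sketch below correlates Side A user counts with Side B revenue at different time lags. The function names and the monthly figures are made up, and a high correlation in such data suggests a relationship worth investigating rather than proving that one side drives the other.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def lagged_correlation(side_a, side_b, lag):
    """Correlate side_a at time t with side_b at time t + lag (lag >= 0)."""
    if lag == 0:
        return pearson(side_a, side_b)
    return pearson(side_a[:-lag], side_b[lag:])

# Made-up monthly data: Side B revenue follows Side A user counts one month later.
side_a_users = [100, 120, 150, 140, 180, 220, 210, 260]
side_b_revenue = [900, 1000, 1200, 1500, 1400, 1800, 2200, 2100]

for lag in range(0, 3):
    r = lagged_correlation(side_a_users, side_b_revenue, lag)
    print(f"lag {lag} month(s): r = {r:.2f}")
```

In this constructed example the correlation is strongest at a lag of one month, which would hint that growth on Side A shows up in Side B revenue with a delay.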

Basic metrics can also be used to calculate profitability, e.g.

Average Transaction Value x Average Customer Lifetime > Cost per Customer Acquisition

Business models of many startup platforms are geared towards “nickel economics,” meaning that the average transaction values are very low. In such situations, the customer acquisition cost has to be low as well, or the frequency of transactions extremely high. When these rules are violated, the whole business model does not make sense. This is partly because of the competitive nature of the market, which requires sizable budgets for actual user/customer acquisition. For platforms, the situation is even more serious than for other businesses because network effects require the existence of critical mass that costs money to achieve.
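The inequality above can be checked with a small calculation. Note one added assumption: the sketch makes the transaction frequency an explicit factor (`transactions_per_month`), which the original inequality leaves implicit. All names and numbers are hypothetical.

```python
def lifetime_value(avg_transaction_value, transactions_per_month, lifetime_months):
    """Revenue one customer generates over the relationship.

    transactions_per_month is an added assumption; it makes the
    "frequency of transactions" mentioned in the text explicit.
    """
    return avg_transaction_value * transactions_per_month * lifetime_months

def is_viable(avg_transaction_value, transactions_per_month, lifetime_months, acquisition_cost):
    """True if lifetime value exceeds the cost of acquiring the customer."""
    value = lifetime_value(avg_transaction_value, transactions_per_month, lifetime_months)
    return value > acquisition_cost

# Made-up "nickel economics" case: 0.50 per transaction, 12-month lifetime,
# 20.00 acquisition cost per customer.
print(is_viable(0.50, 2, 12, 20.0))   # low transaction frequency
print(is_viable(0.50, 10, 12, 20.0))  # high transaction frequency
```

With two transactions per month the customer generates 12.00 and does not cover the 20.00 acquisition cost; with ten per month the customer generates 60.00 and does.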

In the real world, customer acquisition cost (CPA) cannot usually be ignored, apart from a few outliers. The CPA structure might also differ between the platform sides, and it is not self-evident what type of customer acquisition strategies yield the lowest CPAs. In fact, it is an empirical question. A highly skilled sales force can bring in new suppliers at a lower CPA than a digital marketer who lacks the skills or organic demand. Then again, under favorable conditions, the CPA of digital advertising can be minuscule compared to that of salespeople because advertising scales.

However, as is apparent from the previous list, relationship durations also matter. For example, many consumers can be fickle, but supplier relationships can last for years. This means that suppliers can generate revenue over a longer period of time than consumers, possibly turning a higher acquisition cost into a more profitable investment. Therefore, the churn of each side should be considered. Moreover, there are costs associated with providing customer support for each side. As Lauri noted based on his first-hand experience working for a platform company, the frequency and cost per customer encounter differ vastly by side and require different kinds of expertise from the company.

In cases where the platform has an indirect business model, also called subvention because one side subsidizes the cost of the other, the set of metrics should be different. For example, if only Side B (paid users) is paying the platform but is doing so because of Side A (free users), Side B could be monitored with financial metrics and Side A with engagement metrics.

Finally, profitability distribution refers to the uneven distribution of profitable players on each market side. This structure is important to be aware of. For example, in e-commerce it is typical that there are a few “killer products” that each account for a relatively large share of sales value, but the majority of the total sales value is generated by hundreds or thousands of products with small individual sales. Understanding the dynamics of adding both killer products and average products (or complements, in platform terms) is crucial for managing platform growth.
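A simple way to inspect such a distribution is to compute the share of total sales generated by the top sellers. The sketch below uses a made-up catalog; the function name and the figures are illustrative only.

```python
def top_share(sales, top_n):
    """Share of total sales value generated by the top_n best-selling products."""
    ranked = sorted(sales, reverse=True)
    return sum(ranked[:top_n]) / sum(ranked)

# Made-up catalog: 3 "killer products" plus 500 long-tail products.
sales = [5000, 4000, 3000] + [40] * 500

share = top_share(sales, top_n=3)
print(f"top 3 products: {share:.0%} of sales, long tail: {1 - share:.0%}")
```

Here three products out of 503 bring in 37.5% of sales value, while the long tail still generates the majority. The same calculation can be run separately for each platform side.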

Footnotes:

[1] Lauri Pitkänen, my best friend.