Caution!!! BigData S.L.I.P.S.: five tips when using analytics



During my brief research on BigData, I’ve found five types of S.L.I.P.S. that a data scientist might encounter along the way: Statistic, Learning, Information, Psychology and Sources.

1) Statistic (Left Foot)

This is without any doubt the main and best-known technical aspect. The most common slip concerning statistics is mistaking correlation for causation. In other words, discovering correlations among variables doesn’t necessarily imply a cause-effect relation. Mathematically speaking, correlation alone is not a sufficient condition for a cause-effect relationship: a hidden third variable may be driving both.

(see also K. Borne: Statistical Truisms in the Age of BigData).
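Just to make this slip concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data is simulated, and the names (temperature, ice_cream_sales, sunburn_cases) are hypothetical. Two series are driven by the same hidden variable, so they correlate strongly even though neither causes the other; once the hidden driver is removed, the correlation disappears.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical hidden driver (simulated, not real measurements).
temperature = [rng.gauss(20, 5) for _ in range(5000)]

# Both series depend on temperature, but neither depends on the other.
ice_cream_sales = [t + rng.gauss(0, 2) for t in temperature]
sunburn_cases = [t + rng.gauss(0, 2) for t in temperature]

r_raw = pearson(ice_cream_sales, sunburn_cases)

# Subtract the driver's contribution (known exactly here because the data
# is simulated) and correlate what is left.
ice_residual = [s - t for s, t in zip(ice_cream_sales, temperature)]
burn_residual = [c - t for c, t in zip(sunburn_cases, temperature)]
r_controlled = pearson(ice_residual, burn_residual)

print(f"correlation, raw:        {r_raw:.2f}")
print(f"correlation, controlled: {r_controlled:.2f}")
```

In real data the hidden driver is usually not known in advance, which is exactly why the slip is so easy to make.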

2) Learning (Right Foot)

OK, let’s assume that a cause-effect relationship exists: which model or algorithm should you choose to describe the relationship? There are many: ARMA, the Kalman filter, neural networks, customized ones… which one fits best? A model that has been validated with the data available now might not be valid anymore in the future. So, constantly monitor and measure the prediction error, that is, the gap between the values estimated by the model and the values actually observed.

Choosing a model implies making assumptions. In other words, never stop learning from data and be open to breaking assumptions, otherwise predictions and analysis will be slanted.
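The monitoring advice above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the ErrorMonitor class, the window of 5 and the threshold of 1.0 are hypothetical choices, and the "model" is a toy rule validated on a trend that later changes slope.

```python
from collections import deque

class ErrorMonitor:
    """Track the rolling mean absolute error of a model's predictions."""

    def __init__(self, window, threshold):
        self.errors = deque(maxlen=window)  # keeps only the latest errors
        self.threshold = threshold

    def update(self, predicted, actual):
        """Record one prediction and return the current rolling error."""
        self.errors.append(abs(predicted - actual))
        return self.rolling_error()

    def rolling_error(self):
        return sum(self.errors) / len(self.errors)

    def needs_review(self):
        """True once the window is full and the error exceeds the threshold."""
        full = len(self.errors) == self.errors.maxlen
        return full and self.rolling_error() > self.threshold


# Simulated series: the slope is 1 for the first 20 points, then becomes 3.
observations = [float(i) for i in range(20)] + [19 + 3.0 * i for i in range(1, 11)]

monitor = ErrorMonitor(window=5, threshold=1.0)
alerts = []
for step in range(1, len(observations)):
    # Toy model (an assumption): "the next value is the last value plus 1".
    prediction = observations[step - 1] + 1.0
    monitor.update(prediction, observations[step])
    if monitor.needs_review():
        alerts.append(step)

print("first step flagged for review:", alerts[0] if alerts else None)
```

The model is perfect on the data it was validated with and silently degrades afterwards; the rolling error is what reveals that the assumption behind it has been broken.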

3) Information (Right Hand)

Which information is really meaningful? That’s the first point to clarify before implementing a bigdata initiative or any new BI tool for your business.

Another point is confusing information with data. According to information theory, and well-grounded common sense as well, data are facts while information is an interpretation of facts based upon assumptions (see also the D.A.I. model).

(see also: D. Laney & M. Beyer: BigData Strategy Essentials for Business and IT).

4) Psychology (the Head… of course!)

Have you ever heard about echo-chamber effects and social influence? Well, what happens is that social media might amplify irrational behaviours, where individuals (me included) base their decisions, more or less consciously, not only on their own knowledge or values but also on the actions of those who acted before them.

In particular, whenever dealing with tricky, slippery tools such as bigdata sentiments, it is better to consider carefully the relevance and impact of psychology and behaviours. The risk is to gather data that is intrinsically biased (see also My Issue with BigData Sentiments).

(see also:

D. Amerland: How Semantic Search is changing end-user behaviour

C. Sunstein: Echo Chambers: Bush v. Gore, Impeachment, and Beyond – Princeton University Press

e! Science News: Information technology amplifies irrational group behavior).

5) Sources (Left Hand)

Variety!!! That is one of the three Vs suggested by D. Laney: Volume, Velocity and Variety. Choosing the right model is not the only thing that matters in order to avoid biased predictions and insights: what about the reliability of the sources of the data used for the analysis? If the data is biased, predictions and insights will be biased as well. In particular, any series of data has a variance and a bias that cannot be eliminated.

How to mitigate such a risk? By gathering data from different sources and weighting each of them according to its reliability: the variance.
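One common way to weight sources by their variance is inverse-variance weighting. The sketch below is my own reading of the tip, not a prescribed method: the function name combine_sources and the three example ratings are hypothetical, and the approach assumes the sources are independent and unbiased (it reduces variance, it does not remove bias).

```python
def combine_sources(estimates, variances):
    """Inverse-variance weighted mean and the variance of that mean."""
    # A source with a smaller variance (more reliable) gets a larger weight.
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, 1.0 / total

# Hypothetical ratings of the same hotel from three sources, each with an
# assumed variance describing how noisy that source is.
estimates = [4.0, 3.0, 5.0]
variances = [0.25, 1.0, 4.0]

combined, combined_variance = combine_sources(estimates, variances)
print(f"combined estimate: {combined:.2f}")
print(f"combined variance: {combined_variance:.2f}")
```

Under these assumptions the combined variance is smaller than the variance of even the best single source, which is the statistical reason why Variety pays off.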

Moreover, as a bigdata scientist and as a consumer as well, never forget positive and negative SEO tactics. There is a social-digital jungle there! (see Tripadvisor: a Case Study to Think Why BigData Variety matters).

Feelink – Feel & Think approach for doing life!


The D.A.I. model to better understand different mindsets and cultural values: why does social responsibility mean higher prices?


A few weeks ago I received a direct message from a new Twitter follower with the following question: “Do you spend more money with a brand that you think is socially responsible?”. I immediately felt that it could be either market research or a way to create awareness of something; nothing bad about it, whatever it is.

Anyhow, the aim of a question is to gather information. So which information does the question above want to address? Suddenly a principle from information theory came to my mind: information is an interpretation of data based on assumptions (see figure). Usually assumptions are due to culture, mindset and context in general. Think, as an example, of how the same gesture of moving the head up and down (data) means yes for most Europeans and Westerners, while in some other cultures (Bulgaria is a commonly cited case) it can mean exactly the opposite.

[Figure: information as an interpretation of data based on assumptions]

So, why not apply such a principle from information theory to everyday life too, in order to better understand ourselves as well as others? Let’s analyze more deeply the question “Do you spend more money with a brand that you think is socially responsible?”

First of all, the question is a closed one, since the answer must be yes or no. When I realized that, I felt uncomfortable… why? I thought about it and realized that it is due to the value of “social responsibility”, which the question forces to stand against “price” (money).

Having acknowledged that, I unconsciously inferred that if the answer to the question were YES, it would mean that social responsibility is priceless and thus more important than money. Vice versa if the answer were NO.

…however, why infer such considerations? What is the assumption behind them? That was my doubt, and my hypothesis was that the assumption behind the tricky question “Do you spend more money with a brand that you think is socially responsible?” is: being socially responsible costs!

…wow, eureka! So, why not create the conditions so that pursuing social responsibility intrinsically implies cheaper products?

That was the question that I sent to the owner of the research… and, to my great surprise, I received the following answer: “The impression is socially responsible = higher product cost to the consumer.”

Bingo! The assumption that I inferred was right. There is a kind of cultural impression, suggestion and mindset that unconsciously leads us to think (me included) that if you want socially responsible products there is no other way: you have to pay more! Why?

Paradoxically, since people behave according to incentives, if social responsibility intrinsically implied cheaper prices instead, a virtuous circle would be established!

How to create a context where the assumption “socially responsible = higher product cost” is replaced with “socially responsible = cheaper product”?

…I don’t know, any idea?

Meanwhile, why not apply the DAI (Data, Assumption, Information) model whenever we infer quick answers?

Behind each piece of information there is an unknown world of undisclosed assumptions.


Semantic search algorithm, behaviorism and the fairy tale of Snow White and the seven dwarfs. Would SEO behave like Grumpy?


How does semantic search work? What are the implications for SEO tactics and for users’/customers’ behaviors?

Google search is not unlike the “Mirror, mirror on the wall, who’s the fairest of them all?” where the question asked reveals (in the fairy tale) the Evil Queen’s narcissistic obsession: what a great metaphor to explain how semantic search works! (see Google Search and the Racial Bias).

I will take the assist from David Amerland to help me better understand the SEO world (something still unknown to me), as well as to remember childhood times with the fairy tale “Snow White and the seven dwarfs”.

So, let’s have a look at the characters of the famous fairy-tale:

The mirror is the result of the search engine. According to what I’ve understood about semantic search, the mirror reflects back a result that is contextualized according to the user and his/her relationships across the social networks, as well as through the analysis of past behaviours.

Snow White is the most beautiful creature in the WEB forest. She publishes smart content and establishes such trusted relationships on social media that the mirror (the semantic search engine) reflects back a beautiful princess… according to the algorithm, I would say.

The evil queen is the bad guy, attempting to be viewed as the most beautiful in the WEB forest while she is not. The evil queen struggles and suffers a lot for that, since the mirror always suggests Snow White as the best result… life in the digital jungle is not so easy for the evil queen!

The poisoned apple represents a trick, a negative SEO attack where the objective is either to game the search engine (the mirror) or to compromise the reputation of Snow White. Fake reviews, whether negative or positive SEO tactics, are just an example of how an apple could be poisoned in order to digitally kill a competitor and game the search engine algorithm (see the case of TripAdvisor).

The seven dwarfs are data scientists and SEO experts who are mining the WEB forest in order to get some valuable and reliable information from the WEB. Usually they are well-intentioned and thus willing to protect the beauty of Snow White from negative SEO (the poisoned apple and the evil queen).

The charming Prince represents all the users, companies and individuals, that go deeper and deeper into the WEB forest in order to discover the truth. The mirror’s result apart: who is really the fairest in the WEB forest? Encountering a few smart dwarfs might be useful for the charming Prince, both in the forest to discover the beauty of Snow White and in the WEB to find great content and reputations according to personal impressions rather than relying only on algorithms.

…so, what is the moral of the fairy tale “Snow White and the seven dwarfs” applied to modern semantic search and SEO?

An interesting point has been made by D. Amerland in his article “How semantic search is changing end-user behaviour”. In particular:

The fact remains that the web is changing, search has changed and the way we operate as individuals, as well as marketers, has changed with it.

Since semantic search is powerful enough to influence the behaviour of the end-user (individuals, companies,…), the point is: what kind of algorithm is there behind the mirror on the wall? What are the criteria behind the result that identifies the fairest princess in the WEB?

A more interesting doubt: what happens if the criteria behind the search algorithm (the mirror) change so that the fairest in the WEB becomes Grumpy, one of the seven dwarfs? Would all the end-users and SEO experts really want to become and behave like Grumpy?


Barriers to change… Should I stay or should I go? A ripped up speech


Ferdinandeo (Triest), Saturday 21st September 2013 around 12:00 a.m.: what to say as a final speech, on behalf of the whole class, after attending an MBA program?

Here below is an idea, a story about barriers to change… delivered here, in a comfortable “context”, with 10 days of delay: should I stay or should I go?

Should I stay or should I go? A story about barriers to change

Should I stay or should I go? That was the question that each MBA participant faced when applying for the master program in business administration here at MIB School of Management.

Should I get an MBA in my country or abroad?

The MBA class of the 23rd edition was almost equally distributed: 60% foreigners and 40% Italians. Who made the right decision?

Nobody knows… now!

Anyhow, what all the participants of the 23rd edition have in common, both foreigners and Italians, is that they have all started a process of change in some way:

some changed country, some others quit a job and some did both.

Was that easy? Of course it was not!

Why? Because each change requires a transformation process, and each transformation process requires resources:

physical, mental and emotional.

So… what are the barriers to change? I would say mainly three:

unawareness, laziness and conservation of the status quo.

The first one, unawareness, means that since I don’t know there is a problem, why invest resources in a change? How to start a process of change in such a situation? Simply by creating awareness: “Houston, we have a problem!”

The second one, laziness: I know there is a problem, but solving it requires too many resources: physical, mental and emotional. In this case the therapy is defining an objective that is attractive enough to justify the effort.

The third one, conservation of the status quo, is the toughest: I do know there is a problem and I do not want to change, since I feel comfortable in the current situation. I am not sure… in this case uncertainties about the current situation and status quo will trigger a process of change.

Why uncertainties? According to a passage taken from a speech held here in this hall a few months ago: “Since the economy is not growing in Europe and in the Western countries, the only alternative for getting good jobs is to go abroad where the economy is booming”

So… should I stay or should I go? According to this story, I would say: it depends!

It depends on how uncertain and uncomfortable you are with your current situation and status quo… unless new innovative opportunities and unconventional alternatives are created from scratch.

All the best for the MBA23 and MBA24 classes!

Thank you!

The story was slightly different and this speech has not been delivered because the “context”, the final ceremony for the MBA23 class, was not comfortable for the speaker.

How to break such an uncomfortable situation? …well, you already know the moral of the story: by creating uncertainties through innovative and unconventional alternatives!



My Issue with BigData Sentiment Bubble: Sorry, Which Is the Variance of the Noise? (NON Verbal Communication)


Why is sentiment analysis so hard? How to interpret the word “Crush” in a tweet? Crush as in “being in love” or Crush as in “I will crush you”? According to Albert Mehrabian’s communication model and statistics, I would say that on average a tweet gives a sentimenter an accuracy of 7%. Not such a big deal, is it?

Let’s think about it by considering, as an example, the case of the sentiment analysis described in My issues with Big Data: Sentiment: crush as in “being in love” (positive) or crush as in “I will crush you” (negative)?

What is a sentimenter? As a process, it is a tool that from an input (tweets) produces an output like “the sentiment is positive” or “the sentiment is negative”. Many sentimenters are even supposed to estimate how positive or negative the mood is: cool!

Paraverbal and non-verbal communication

Anyhow, according to Albert Mehrabian, the information transmitted in a communication process is 7% verbal, 38% paraverbal (tone of the voice) and the remaining 55% non-verbal (facial expressions, gestures, posture,…). These are figures that Mehrabian derived for the communication of feelings and attitudes.

In a tweet, as well as in an SMS or e-mail, neither paraverbal nor non-verbal communication is transmitted. Therefore, from a single tweet it is possible to extract only 7% of the information available: the text (verbal communication).

So, what about the paraverbal and non-verbal communication? During a real-life conversation they play a key role, since they account for 93% of the whole message. Moreover, since paraverbal and non-verbal messages are strictly connected with emotions, they are exactly what we need: sentiments!

Emotions are also transmitted and expressed through words, such as “crush” in the example mentioned. However, within a communication process the verbal and the non-verbal are not always consistent. That’s the case when we talk with a friend who says that everything is ok, while we perceive, more or less consciously, something different from his/her tone or expressions. Thus we might ask: are you really sure that everything is ok? As a golden rule, also for everyday life, I would recommend using non-verbal signals as an opportunity to ask questions rather than to infer misleading answers (see also: A good picture for Acceptance: feel the divergences & think how to deal with).

For these reasons, the non-verbal messages are a kind of noise that interferes with verbal communication. In a tweet, it is a noise that interferes with the text. The more sensitive the transmitter and the receiver are to non-verbal communication, the more disturbing such noise can be. It might be disturbing enough to completely change the meaning of the message received.

Statistic and Information Theory

From a statistical point of view the noise might be significantly reduced by collecting more samples. In Twitter, a tweet is one sample and each tweet has 7% of available information (text) and 93% of noise (non-verbal communication), which is the unknown information.

From a prediction/estimation point of view, no noise means no errors.

Thus, thanks to BigData, if the sentimenter analyzes all the tweets, theoretically it’s possible to reduce the noise to zero and thus have no prediction error about sentiments… WRONG!!!

Even if the sentimenter is able to provide a result by analyzing all the BigData tweets (see Statistical Truisms in the Age of Big Data Features):

“the final error in our predictive models is likely to be irreducible beyond a certain threshold: this is the intrinsic sample variance”.

The variance is an estimate of how much the samples differ from each other. In the case of a communication process, that means how changeable emotions are through time. Just for fun, next time try to talk to a friend while randomly changing your mood (happy, sad, angry,…) and see what happens with him/her (just in case, before fighting, tell him/her that it is part of an experiment that you’ve read about in this post).

In Twitter, the variance of the samples is an estimate of how differently emotions impact the use of certain words in a tweet, from person to person at a specific time. Or, similarly, considering one person, how differently emotions impact the use of words through time.
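A small simulation may help to see the difference between the noise that more samples can remove and the variance that stays. It is only a sketch under my own assumptions: the sentiment scores are simulated from a normal distribution, and TRUE_MEAN and TRUE_SPREAD are hypothetical values, not measurements from real tweets.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

TRUE_MEAN = 0.2    # hypothetical average sentiment score
TRUE_SPREAD = 1.0  # hypothetical standard deviation from tweet to tweet

results = {}
for n in (100, 10_000, 200_000):
    # Simulated sentiment scores, one per tweet.
    scores = [rng.gauss(TRUE_MEAN, TRUE_SPREAD) for _ in range(n)]
    mean = sum(scores) / n
    # Sample variance: how much single tweets differ from each other.
    sample_variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    # Standard error: how uncertain the *average* sentiment still is.
    standard_error = (sample_variance / n) ** 0.5
    results[n] = (sample_variance, standard_error)
    print(f"n={n:>7}  sample variance={sample_variance:.3f}  "
          f"standard error of the mean={standard_error:.4f}")
```

With more tweets the uncertainty about the average mood keeps shrinking, while the sample variance settles around the spread of the population instead of going to zero. That remaining spread is the part that no amount of BigData removes when predicting a single tweet.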

Like in a funnel (see picture), the sentimenter can eliminate the noise and thus reduce the size of the tweet bubbles (the bigger the bubble, the higher the noise) down to a fixed limit that depends on the quality of the sample: its variance.

[Figure: the sentimenter as a funnel that shrinks the Twitter sentiment bubbles]

So, I have a question for bigdata sentimenters: what is the sample variance of tweets due to non-verbal communication? Once the sample variance is known, the prediction error of the best sentimenter ever is also given:

error of prediction (size of the bubble sentiment) = sample variance of tweets…

…with the assumption that both the samples and the algorithm used by the sentimenter are not slanted/biased. If this is not the case, the bigdata sentiment bubble might be even larger and the prediction less reliable. Anyhow, that is another story, another issue for BigData sentimenters (coming soon, here in this blog. Stay tuned!).


A possible TIP for giving effective feedback: wearing a SCARF that is SMART, does it make sense?


Four ways to provide effective feedback

A few weeks ago, Time posted an article by Annie Murphy Paul: “Four Ways to Give Good Feedback” (originally posted in the Brilliant Report blog). As reported in the post, “feedback is a powerful way to build knowledge and skills, increase motivation, and develop reflective habits of mind in students and employees”. Briefly, the four suggested ways to provide effective feedback are: 1) supply information about what the learner is doing, 2) take care about how the feedback is presented, 3) orient feedback around goals and 4) use feedback to build metacognitive skills (develop the awareness of learning).

What stimulated my curiosity is point 2: how to present feedback in a way that is effective? I think it is the toughest aspect because it requires something that the brain naturally refuses to do: deliver the message according to a mindset that is different! In the article mentioned above, taking care about how feedback is presented means avoiding three things: a) closely monitoring performance, because it can make the learner nervous or self-conscious, b) providing a unique solution like “This is how you should do it”, because it might be interpreted as an attempt to control, and c) establishing a sense of competition among colleagues, because it might reduce engagement.

Thus, how can good feedback be ensured in practice? An idea could be to combine the S.C.A.R.F. model given by Social/Cognitive Neuroscience with the S.M.A.R.T. criteria for setting well-defined objectives. Let’s see how the SMART-SCARF matrix works after a brief description of the two models.

The S.C.A.R.F. model from Social and Cognitive neuroscience.

M.D. Lieberman & N.I. Eisenberger provided many insights regarding Social, Cognitive and Affective neuroscience. In particular, in their article “The pains and pleasures of social life: A social cognitive neuroscience approach” they describe two main circuits that the human brain activates: simply put, one circuit for pains and one circuit for pleasures. That said, social and cognitive neuroscience might also give some further specific insights on how to provide effective feedback. S.C.A.R.F. is a framework in which the “approach (reward)” and the “avoid (threat)” instinctive responses, given by the “pleasure” and “pain” circuits respectively, are mainly related to five human social domains of experience: 1) Status – the relative importance to others, 2) Certainty – the ability/need to predict the future, 3) Autonomy – the sense of control, 4) Relatedness – a sense of safety with others and 5) Fairness – the perception of fair exchange between people and of justice.

More: “SCARF: a brain-based model for collaborating with and influencing others”.

Each one of us has lived different experiences in various environments, and thus there are many different S.C.A.R.F.s as well… as a matter of fact, have you ever seen a shop selling only scarves made of silk, or only blue ones?

[Figure: SCARF2]

If someone likes the kind of S.C.A.R.F. pictured above, it means that the main dimensions that stimulate the “approach (reward)” and the “avoid (threat)” responses are Status and Autonomy. A person with such a S.C.A.R.F. tends to be more competitive, because for them winning a game, being the best student or being promoted in their company will more likely activate the “approach (reward)” response, while the “avoid (threat)” response will be activated when they perceive a reduction of their Status. For example, pushing solutions might be tricky, since advice might be perceived by a person with a high Status as follows: “You are giving me advice because you think you have more skills/experience than me.” The emotional reaction to such a perception is more likely negative. Even if the coach has much more experience and skills than the coachee, avoiding emphasizing such a difference will make the coachee feel comfortable.

At the same time, since the Autonomy dimension is also more important than the others, a good mood will be established whenever the sense of autonomy or control increases. For example, that might be achieved by letting them organize their own work, schedule and desk. On the contrary, constantly setting, defining and monitoring the performance of such employees will increase the level of control over them and thus might activate the “avoid (threat)” response.

Now, how is it possible to figure out which SCARF suits the person who is going to receive the feedback? Now that the SCARF model has been described, it’s like wondering about people’s preferences regarding clothes and fashion: just observe, listen and understand. In other words, before giving feedback it’s better to know each person well. Thus, apart from all the recommendations, some common sense might be useful too.

The S.M.A.R.T model for well-set objectives

As mentioned in the article by Annie Murphy Paul at point 3), well-stated feedback is oriented around goals. A cool and well-known tool for providing well-defined objectives is the S.M.A.R.T. model, in which a good objective must be: 1) Specific – What?, 2) Measurable – if you can’t measure it you will NOT handle it, 3) Attractive – Why? What motivates such an effort?, 4) Realistic – not too difficult and not too easy, and 5) Time-scaled – no time limit, no urgency! The S.M.A.R.T. model might be useful for setting the objectives of an evaluation feedback as well as for defining a personal development plan.

As with the dimensions of S.C.A.R.F., each person is also more sensitive to some of the five S.M.A.R.T. criteria than to others. Thus, the common-sense rule “know people first” is crucial in order to deliver feedback in a way that encourages and motivates.

See also the S.M.A.R.T. criteria.

A TIP for giving effective feedback: a SCARF that is SMART

Now, given the S.M.A.R.T. criteria for well-defined objectives and the S.C.A.R.F. framework with its five social/cognitive dimensions (Status, Certainty, Autonomy, Relatedness and Fairness), how is it possible to combine these tools in order to provide feedback effectively, engaging people and avoiding threats? Let’s take the S.C.A.R.F. mentioned above, with a high perception in the Status and Autonomy dimensions. What are the “DOs” and the “DO NOTs” for these dimensions?

With a high Status, in order to activate the “approach (reward)” response it’s necessary to recognize the previous achievements/improvements before specifying the new ones (the “S” of S.M.A.R.T.) and to make them more attractive by emphasizing how the new goals can be an opportunity to achieve a distinctive specialization/quality (the “A” of S.M.A.R.T.). Meanwhile, in the “Specific” dimension of S.M.A.R.T., as mentioned above, pushing solutions activates the “avoid (threat)” response and makes the coachee uncomfortable and thus unmotivated.

Regarding the Autonomy dimension of S.C.A.R.F., what is recommended is to give opinions instead of solutions when specifying the new goals/objectives (the “S” of S.M.A.R.T.). In order to motivate (the “approach (reward)” response) and make the goal attractive (the “A” of S.M.A.R.T.), provide at least three possible solutions and alternatives, because that will increase the sense of Autonomy and control (only two will create a “dilemma”!). The “DO NOTs” for Autonomy are linked with the Specific and the Time-scaled dimensions of S.M.A.R.T.: respectively, avoid specifying only one solution and avoid dictating a detailed schedule and plan.

Final Considerations

By combining the two in a matrix, with one dimension for the S.C.A.R.F. framework and the other for the S.M.A.R.T. criteria, it’s possible to define which objectives to set and how to deliver them properly in order to motivate people and reinforce a positive mood in the team, in the work environment and, why not, also in our personal life.

More: see also a possible detailed schema for the SMART-SCARF matrix here (SlideShare).

All four points mentioned in the post “Four Ways to Give Good Feedback” are present in the SMART-SCARF, thus nothing new to add. However, organizing the thousands of recommendations given by experience and by neuroscience research in a structured way, as has been done in the SMART-SCARF matrix, might be useful in order to put them into practice.

Well, it’s time to wear and validate the SMART-SCARF in the real world… Do you think it will work?


Developing Emotional Intelligence with TIP competencies and Kindergarten Cop… Arnold Schwarzenegger!


Emotional Intelligence and TIP competencies

Emotional Intelligence (EQ) was popularized by Daniel Goleman in his best seller “Emotional Intelligence: Why It Can Matter More Than IQ”. Nowadays, Emotional Intelligence (EQ) is a well-known concept both in everyday life and in the work environment, and it has become as important as Analytical Intelligence (IQ).

Moreover, EQ is even more important for a “finely attuned” leader in order to develop Social Intelligence (see Social Intelligence and the Biology of Leadership by D. Goleman & R. Boyatzis – Harvard Business Review) and thus improve team effectiveness with better decisions, more creative solutions and more productivity (see Building the Emotional Intelligence of Groups by V. Urch Druskat & S.B. Wolff – Harvard Business Review). In the book by D. Goleman mentioned above, there are many useful TIPs for being aware of emotions such as anxiety, sadness and anger, and for learning how to deal with them.

A few weeks ago I saw two colleagues of mine talking, and suddenly one of them started to talk louder and faster. The other colleague, who was listening, perceived such a change in the para-verbal communication and suddenly exclaimed: “Please, calm down! Calm down”. In that moment I realized how non-verbal communication can have an impact (positive or negative) on our emotions (see also the post “A Good Picture for Acceptance: feel the divergences and think how to deal with”). Is it possible to deal positively with our emotions? The competencies in The International Profiler (TIP), which I discovered during the MBA, might be a powerful tool for learning how to be more emotionally intelligent. How? Let’s analyze a similar case from the movie Kindergarten Cop with Arnold Schwarzenegger, where some TIP competencies such as Attuned, Coping, Resilience and Reflected Awareness, with a pinch of New Thinking, played a key role in succeeding with the toughest species that inhabits a typical kindergarten environment: kids! (see the TIP’s competencies here).

Emotional Intelligence case study: the “Kindergarten Cop” A. Schwarzenegger

See the clip here taken from the movie “Kindergarten Cop” with Arnold Schwarzenegger (3 minutes), before reading the analysis.

So, what happened to our Kindergarten Cop Arnold Schwarzenegger? Let’s start from what our hero said before realizing what was going on in the class: “Don’t worry! Everything is under control!”.

  • Symptom 1: once our Kindergarten Hero realized that not everything in the class was exactly under control, because there were kids shouting, screaming, touching and destroying everything, he started to show some signs of impatience. Such a shocking and stressful situation made him shout in anger: “Shuuut Uuuup!”. Diagnosis 1: definitely a low Coping for Schwarzenegger, since he was not able to handle the stress.