
Analysis: Detect Fake Reviews and Ratings

July 23, 2020 (updated April 3, 2024)

Do two thirds of all products really have a rating that is too good?

There are more and more articles about “fake reviews” in the press and on various online news portals. This refers to various industries and their internet platforms, from gastronomy to travel to search engines. Mostly it is about Booking.com, Tripadvisor, and Google. And of course, about Amazon.

Interestingly, there is little specialist literature and there are few studies on the subject of “fake reviews”. Everyone knows that “fake” or purchased reviews exist; however, their extent and impact have been poorly documented.

The pioneers in uncovering “fake reviews” and combating them are usually consumer protection organizations or similar players. In the UK, this is “Which?”; in Germany, for example, it is Stiftung Warentest.

As review experts, we at gominga are of course also intensively concerned with the topic of “fake reviews” to provide our customers with advice and support. Here we would also like to refer to our chapter “Product reviews on Amazon: Relevance and recommendations for companies” in the book “Amazon für Entscheider” (Springer Verlag, April 2020).

In the following, we have taken a recent article by Stiftung Warentest as an opportunity to take a closer look at “fake reviews” and how they can be identified. To do this, we used the analysis tool provider ReviewMeta, analogously to the Stiftung Warentest evaluation, and carried out an additional comparison with further review data.

In addition to our findings, we also provide further links on the topic, including current studies by “Which?” from the UK and Stiftung Warentest.

“Fake Reviews” – why at all?

For several years there have been national and international studies that prove the influence of reviews on consumers. A connection between reviews, price, and sales is often demonstrated. Therefore, it is understandable to a certain extent that there are attempts to influence the business positively by manipulating reviews.

“Fake reviews” are usually effective when the product comes from a lesser-known brand, as they can help improve its online reputation. They also help with products that have only a few technical attributes, because prospective buyers then depend more on the assessment of other users than with products that differentiate themselves through their technical properties. We have observed that it is rather low-priced products that are subject to manipulation, since here consumers look more at the star rating and less at the review text.

The attack surface for manipulation comprises the star value, the number of reviews and their distribution across the 1-5 stars, as well as individual “high quality fake” reviews (cf. Which? 2020). Most platforms – including Amazon – now have very rigid rules for writing reviews and increasingly enforce them with the help of their algorithms, yet reviews are still bought or otherwise manipulated. It starts with “Family & Friends” and ends with an already very professional shadow industry of social media groups, payment services, bots, and review hijacking & merging.

 

“How sellers use purchased praise to manipulate customers”

(Stiftung Warentest)

In June 2020, Stiftung Warentest published a report on fake reviews on Amazon. For their research, the testers bought reviews from an agency, in some cases incognito, and then analyzed them for authenticity using the analysis tool provider ReviewMeta. The report also explains common manipulation methods.

We took the report as an opportunity to look closely at its methodology, procedure, and results, to replicate the analysis, and to compare it with another data set.

The question and goal for us was: can purchased reviews really be identified that easily? What percentage of reviews is bought, and is there a difference between branded products and "no name" products?

Identification of “fake reviews”

How can you recognize purchased reviews? What are the criteria? How can such a testing process be automated for large amounts of data?

Basically, there are various indications that point to manipulation. The authors, on the one hand, and the review, on the other, are scrutinized.

ReviewMeta is one of the most prominent tools on the market for checking Amazon reviews for authenticity. The company first collects review data from the various Amazon country platforms and then subjects it to an authenticity check. Currently, for example, there are 5.3 million checked products for amazon.com and almost 650,000 for amazon.de. This means, of course, that the check is static and not all current data can always be queried. Moreover, even this very large data set does not include all products and their reviews: ReviewMeta mostly evaluates the top 100 products of various categories. These top-seller lists then contain – sometimes more, sometimes less – a mixture of products from well-known brand manufacturers and low-priced no-name producers. The tool is free, so anyone can check individual products and their reviews either at ReviewMeta directly or through a browser plugin. Finally, it should be mentioned that ReviewMeta doesn’t speak of “fake reviews”, but rather: “ReviewMeta analyzes Amazon product reviews and filters out reviews that our algorithm detects may be unnatural.”


What did gominga investigate?

We used ReviewMeta’s tool to analyze and compare two different review data sets. The first test group consists of 45 products for which we know of purchased or “incentivized” reviews. The second test group includes 110 products from brand manufacturers for which there are no purchased or “incentivized” reviews – all well-maintained assortments with consistently organically generated reviews and active customer support that responds to negative reviews and answers questions.

So our question was:

  • Does ReviewMeta recognize purchased reviews?
  • How many reviews does ReviewMeta mark as “unnatural”?
  • How does the star value change after ReviewMeta’s sorting out of “unnatural” reviews?

We were also interested to see if there was a difference between inexpensive, no-name products (often cell phone cases, accessories, cables, water bottles, etc.) and the products of brand-name manufacturers.

The calculation of the star value is important for the analysis and the results. Both ReviewMeta and we can only calculate this using the arithmetic mean of the ratings. Amazon, on the other hand, uses an algorithm that considers various unknown factors. Therefore, there is usually a considerable discrepancy between the star value displayed on Amazon and the calculated value.

Amazon explains the calculation of the star value as follows:

“These models consider several factors, including, for example, how recent the review is or whether it is a verified purchase. They use multiple criteria to determine the authenticity of the feedback. The system continues to learn and improve over time. We do not consider customer reviews that do not have ‘Verified Purchase’ status in the overall star-based rating of products unless the customer adds further details in the form of text, images, or videos.”

Both we and ReviewMeta only consider reviews with text in our calculations. Verified star-only ratings without text are included in Amazon’s overall average, but they currently cannot be captured by external means, neither by us nor by ReviewMeta; they can only be approximated by comparison if the product has already been recorded before.
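To make this concrete, here is a minimal Python sketch of the recalculation described above: only reviews with text enter the arithmetic mean, and the result is rounded to the nearest half star as Amazon displays it graphically. The review structure is our own assumption; Amazon’s actual weighting algorithm is not public and is not reproduced here.

```python
# Minimal sketch, assuming a list of dicts with "stars" and "text" fields.
# Amazon's real, weighted calculation is not public; this is the plain
# arithmetic mean used for the recalculation.

def arithmetic_star_value(reviews):
    """Mean star rating over reviews that contain text."""
    rated = [r["stars"] for r in reviews if r.get("text")]
    return sum(rated) / len(rated) if rated else None

def displayed_half_stars(mean):
    """Round a mean rating to the nearest half star, as shown graphically."""
    return round(mean * 2) / 2

reviews = [
    {"stars": 5, "text": "Great quality"},
    {"stars": 4, "text": "Good value"},
    {"stars": 1, "text": ""},  # star-only rating without text: excluded here
    {"stars": 3, "text": "Satisfied"},
]

mean = arithmetic_star_value(reviews)
print(f"arithmetic mean: {mean:.2f}, displayed: {displayed_half_stars(mean)} stars")
# -> arithmetic mean: 4.00, displayed: 4.0 stars
```

As the rounding step shows, a change in the mean of less than 0.5 often leaves the graphically displayed half-star value unchanged – a point that matters for the results below.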

 

ReviewMeta review criteria and process

ReviewMeta’s process checks all reviews against a variety of criteria by which either the review or its author can be deemed “untrustworthy” – these are:

Author:

  • Brand Repeaters
  • Brand Loyalists
  • Brand Monogamists
  • Never-Verified Reviewers
  • One-Hit Wonders
  • Overlapping Review History
  • Reviewer Participation
  • Single-Day Reviewers
  • Take-Back Reviewers

Rating:

  • Deleted Reviews
  • Incentivized Reviews
  • Phrase Repetition
  • Unverified Purchases
  • Word Count Comparison

As a result, ReviewMeta creates a report which, for each of the various criteria (see below), shows either a “FAIL” if too many anomalies were registered, a “WARN” for a moderate number, or a “PASS” if the criterion was not triggered. In the following, we only looked at the fails.
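As an illustration of how such a catalog can be thought of, the following sketch grades a single criterion by the share of flagged reviews. The thresholds are invented for illustration only; ReviewMeta does not publish its exact cut-offs.

```python
# Hedged sketch: grade one test criterion as PASS / WARN / FAIL based on
# the share of suspicious reviews. The 10% / 30% thresholds are assumptions,
# not ReviewMeta's actual values.

def grade_criterion(suspicious, total, warn_share=0.10, fail_share=0.30):
    """Grade a criterion by the share of reviews it flags."""
    share = suspicious / total if total else 0.0
    if share >= fail_share:
        return "FAIL"
    if share >= warn_share:
        return "WARN"
    return "PASS"

report = {
    "deleted_reviews":      grade_criterion(12, 40),  # 30%  -> FAIL
    "phrase_repetition":    grade_criterion(6, 40),   # 15%  -> WARN
    "unverified_purchases": grade_criterion(3, 40),   # 7.5% -> PASS
}
print(report)
```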

ReviewMeta does not scan products at regular intervals. In our investigations, it happened that the last scan of a product was several weeks or months old, or that the product had never been recorded at all. Some evaluations are based on only one, two, or three scans. Such conclusions should be viewed with caution, since for some test criteria (such as deleted reviews) it is essential that the data is up to date.

 

Results of the gominga analysis

In the following, we show the results of our analysis for the two test groups in detail, addressing the areas of “adjustment of the star value”, “number of suspicious ratings” and the main criteria of the ratings identified as “unnatural”.

Adjustment of the star value

As far as the sorting out of the “unnatural” ratings and thus the recalculation of the star value is concerned, the relationship between the two evaluated groups is largely similar. The difference between the “old” star value displayed on Amazon and the “new” star value has changed similarly in both groups.

Changes in the first decimal place make up the largest share: almost 73% of the test group with “incentivized” ratings and 62% of the test group of branded products show a change of less than 0.5, and thus usually below the rounding limit for the graphically displayed stars on Amazon. Of these, 23% and 19% respectively are even below 0.1.

This adjustment is largely due to the difference between the star value displayed on Amazon and the arithmetic mean used for the recalculation. For the products in this cluster, the fewest reviews (compared to the other clusters) were sorted out entirely by ReviewMeta’s rules, i.e., the smallest adjustment took place here.

For 16% of the test group with “incentivized” ratings and 21% of the branded products, a larger change in the star value of more than 0.5 can be determined. This would result in a change in the graphical representation of the star value on Amazon; for example, 4 full stars are displayed instead of 4½.

In 9% of the products with “incentivized” ratings and in 15% of the products of brand manufacturers, the adjustments made by ReviewMeta even lead to improvements in the star rating. In other words, negative ratings are identified as “unnatural”.

There are no changes at all in 2% of the products.

Adjustment of the number of reviews

The products in the test group with “incentivized” ratings had an average of 41 ratings before analysis by the ReviewMeta tool. The test group of branded products had 423. There are many reasons for this. It could be argued that branded products are inherently better known among consumers and are therefore rated more often. Furthermore, the product presentations of the brand manufacturers are presumably better maintained, and the topic of reviews and Q&A is actively managed. On the other hand, the products in the test group with “incentivized” reviews have significantly fewer reviews, which is the original reason for buying reviews in the first place.

For the adjustment of the number of reviews, this naturally means that the chance of finding a large number of “unnatural” reviews in the first place differs considerably between the two test groups.

In the test group with “incentivized” reviews, 61% of the products showed no change at all in the number of reviews, i.e., there were either no complaints or they did not lead to exclusion. Among the branded products, this applied to only 16%.

 

Main criteria of the ratings identified as “unnatural”


Fake Reviews: Fail Criteria

Deleted Reviews

ReviewMeta analyzes the reviews of a product at irregular intervals. If ReviewMeta determines that a previously published review is no longer available, it is counted as a “Deleted Review”. ReviewMeta does not know why the review was deleted; if many reviews disappear, ReviewMeta considers this behavior dubious.

However, there are numerous reasons for Amazon to delete a review or not to allow it at all:

“[...] References to pricing, product availability or alternative ordering options should not be mentioned in customer reviews or under questions and answers.” And: “Customer reviews and Q&As should refer to the specific item. Feedback about Marketplace sellers or shipping issues can be provided [elsewhere].”

(Source: https://www.amazon.de/gp/help/customer/)

So, for example, if a review criticizes shipping or Marketplace merchants, Amazon will remove such a review.

If the manufacturer or its customer support is active on Amazon and pushes for the deletion of such unsuitable reviews, the product collects minus points according to ReviewMeta’s criteria.

Amazon also deletes reviews if there is a clear reference to product testing outside Amazon’s own Vine program. Personal data or references to other shops likewise lead to deletion – all criteria that allow no statement about the authenticity of the review. In contrast, the number of reviews deleted by the users themselves is, in our experience, negligibly small.

And again, a note about ReviewMeta’s data collection: the calculation of deleted reviews is based solely on a ‘before and after’. In other words, a statement about deleted reviews can only be made if the same product has been recorded before and a review with a certain timestamp is missing in the next scan – as ReviewMeta itself explains:

We don't have a magic ability to collect every single deleted review; we can only identify reviews as deleted if we collect them on one date and then notice they are no longer visible on a subsequent date.

Source: ReviewMeta
Fake Reviews: Deleted Reviews
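This ‘before and after’ logic can be illustrated with a simple set difference over review IDs from two scans. This is our own minimal sketch with hypothetical IDs, not ReviewMeta’s actual implementation:

```python
# Hedged sketch: a review only counts as deleted if it was seen in an
# earlier scan and is missing from a later one. IDs are hypothetical.

def deleted_reviews(earlier_scan: set[str], later_scan: set[str]) -> set[str]:
    """Review IDs present in the earlier scan but gone from the later one."""
    return earlier_scan - later_scan

scan_june = {"R1", "R2", "R3", "R4"}
scan_july = {"R1", "R3", "R4", "R5"}  # R2 has vanished, R5 is new

print(deleted_reviews(scan_june, scan_july))  # -> {'R2'}
```

A product that has never been scanned before therefore yields no statement at all about deleted reviews, which underlines the caveat about data freshness above.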

Suspicious Reviewers 

ReviewMeta differentiates between four cases: “One-Hit Wonders”, “Single-Day Reviewers”, “Never-Verified Reviewers” and “Take-Back Reviewers”. The latter, for example, is a reviewer for whom at least one submitted review has since been deleted.

[An amount of] reviewers have had at least one of their past reviews for another product deleted. This is an excessively large percentage of Take-Back Reviewers which may indicate unnatural reviews.

Source: ReviewMeta
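A reviewer-level check along these lines could look as follows. The review histories and the “deleted” flag are assumptions for illustration; ReviewMeta does not publish the threshold for an “excessively large percentage”.

```python
# Hedged sketch: share of "Take-Back Reviewers", i.e. reviewers with at
# least one deleted past review. Data structure and threshold are assumed.

def take_back_share(reviewer_histories: dict[str, list[dict]]) -> float:
    """Share of reviewers with at least one deleted review in their history."""
    if not reviewer_histories:
        return 0.0
    flagged = sum(
        1 for history in reviewer_histories.values()
        if any(review["deleted"] for review in history)
    )
    return flagged / len(reviewer_histories)

histories = {
    "alice": [{"asin": "B001", "deleted": False}],
    "bob":   [{"asin": "B002", "deleted": True},
              {"asin": "B003", "deleted": False}],
    "carol": [{"asin": "B004", "deleted": False}],
}
print(f"take-back reviewers: {take_back_share(histories):.0%}")  # -> 33%
```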

Unverified Purchases

ReviewMeta classifies reviews without a prior purchase of the product on Amazon as “unnatural”.

Reviews without a prior, verified purchase are restricted by Amazon. Even though the frequently cited Amazon rule that customers “[…] can write up to 5 reviews per week that are not marked with the addition ‘verified purchase’” has currently disappeared from the guidelines, the option of submitting reviews that cannot be assigned to a purchase remains – and an “unverified” rating can only be given if verified ratings are also submitted more frequently. However, if a product has a high number of unverified reviews, Amazon reserves the right to block this product from reviewing activity. Apart from that, various portals also offer incentivized verified reviews. From our point of view, it therefore cannot be clearly justified that the criterion Unverified Purchases is an unambiguous indicator of a “purchased” review.
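For illustration, the share of unverified reviews is simple to compute once the “Verified Purchase” badge is available per review. The field names below are our own assumptions:

```python
# Hedged sketch: fraction of reviews lacking the "Verified Purchase" badge.
# As argued above, a high share alone is no proof of purchased reviews.

def unverified_share(reviews: list[dict]) -> float:
    """Fraction of reviews not marked as a verified purchase."""
    if not reviews:
        return 0.0
    return sum(1 for r in reviews if not r["verified_purchase"]) / len(reviews)

sample = [
    {"stars": 5, "verified_purchase": False},
    {"stars": 5, "verified_purchase": True},
    {"stars": 4, "verified_purchase": True},
    {"stars": 5, "verified_purchase": False},
]
print(f"unverified share: {unverified_share(sample):.0%}")  # -> 50%
```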

Word Count Comparison

This criterion shows an accumulation of evaluations with a certain length, i.e., number of words. ReviewMeta measures the length of a review text and then compares it with other reviews of comparable products.

A ‘FAIL’ for this criterion means that a large number of all ratings for this product have a certain number of words, e.g., between 6 and 15.

The following graphics illustrate the “Word Count Comparison” criterion and show reviews with 6 to 15 words. The examples from the Amazon site are organically generated reviews from verified purchases. It remains questionable whether a review can be classified as “unnatural” merely because of the length of its text.
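A rough sketch of such a length check: compare the share of a product’s reviews that fall into a narrow word-count band against a baseline share from comparable products. The band of 6 to 15 words follows the example above; the baseline value and the factor of two are assumptions.

```python
# Hedged sketch: flag a product if an unusually large share of its reviews
# falls into a narrow length band compared to an assumed baseline.

def share_in_band(texts, low=6, high=15):
    """Share of review texts whose word count lies in [low, high]."""
    counts = [len(t.split()) for t in texts]
    return sum(1 for c in counts if low <= c <= high) / len(counts) if counts else 0.0

product_reviews = [
    "Good quality and fast delivery, am satisfied with the price",
    "Works as described, great value for the money overall",
    "Nice design, good quality, fair price, would buy again",
]
baseline_share = 0.25  # assumed share for comparable products

share = share_in_band(product_reviews)
print(f"band share: {share:.0%} vs. baseline {baseline_share:.0%}")
if share > 2 * baseline_share:
    print("suspicious accumulation of similarly long reviews")
```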

Phrase Repetition

ReviewMeta looks here for an accumulation of certain words and expressions. Our evaluation shows that references to product tests are rare; instead, terms such as “quality” and “price” or “am satisfied” are mentioned particularly often. In our observation, this corresponds to ordinary review vocabulary and is not an anomaly in itself.

In the data set of the test group with “incentivized” products, the expression “was made available to me free of charge for test purposes” was found in only one review. The other hits related to “quality”, “price” and “design”.
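A simple phrase counter along these lines could look as follows; the watched phrases mirror the examples above, and the review texts are illustrative.

```python
# Hedged sketch: count occurrences of watched phrases across review texts.
# Frequent generic terms such as "quality" or "price" are normal review
# vocabulary, so a hit is not an anomaly by itself.

from collections import Counter

def phrase_counts(texts: list[str], phrases: list[str]) -> Counter:
    """Count how often each watched phrase occurs across all texts."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for phrase in phrases:
            counts[phrase] += lowered.count(phrase)
    return counts

watched = ["quality", "price", "am satisfied",
           "free of charge for test purposes"]
reviews = [
    "Great quality for the price, am satisfied.",
    "Good price and solid quality.",
    "Was made available to me free of charge for test purposes.",
]
print(phrase_counts(reviews, watched))
# -> Counter({'quality': 2, 'price': 2, 'am satisfied': 1,
#             'free of charge for test purposes': 1})
```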

Reviewer Ease

This is the observation that a reviewer has a particularly good overall average of his reviews, i.e., the reviews usually have 4 or 5 stars.

Brand Repeats & Overlapping Review History 

The criteria Brand Repeats (1%, corresponding to 1 hit) and Overlapping Review History (6%, corresponding to 5 hits) occur relatively rarely. Behind these terms are, on the one hand, reviewers who frequently rate products of one and the same brand and, on the other hand, different reviewers who rate identical products over the same period of time – which suggests a pool of reviewers from a product testing agency. However, it is precisely these relatively unambiguous indicators that can only very rarely be registered, as the sketch below illustrates.
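One way overlapping histories could be detected: compare the rated-product sets of reviewer pairs and flag pairs with a large intersection. For brevity this ignores the time component mentioned above; the overlap threshold and the data are assumptions.

```python
# Hedged sketch: find reviewer pairs who rated many of the same products,
# which can hint at a shared pool of paid testers. Timestamps omitted.

from itertools import combinations

def overlapping_pairs(histories: dict[str, set[str]], min_overlap: int = 2):
    """Yield reviewer pairs whose product sets share >= min_overlap items."""
    for (a, prods_a), (b, prods_b) in combinations(histories.items(), 2):
        common = prods_a & prods_b
        if len(common) >= min_overlap:
            yield a, b, common

histories = {
    "rev1": {"B001", "B002", "B003"},
    "rev2": {"B002", "B003", "B009"},
    "rev3": {"B007"},
}
for a, b, common in overlapping_pairs(histories):
    print(f"{a} and {b} overlap on {sorted(common)}")
# -> rev1 and rev2 overlap on ['B002', 'B003']
```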

 

gominga’s conclusion – it remains difficult

“Fake reviews” will continue to occupy us. Consumer protection authorities are important bodies in order to make the negative effects transparent and to exert pressure on platform operators such as Amazon.

Our analysis shows how difficult it is to identify “fake reviews”. Even with the help of technical tools like ReviewMeta, it is difficult to distinguish between authentic and “unnatural” reviews. The criteria for identifying “fake reviews” are often ambiguous, and so purchased reviews cannot always be found and sorted out with absolute certainty.

Our analysis also illustrates the differences that can exist between different product groups and product ranges.

Based on our analysis, we could not clearly confirm that most reviews are ‘fake’ and that the vast majority of products are rated too well. Only for at most a quarter of the analyzed products were we able to determine a better rating average that could not be explained by Amazon’s calculation method. In most cases, the degree of adjustment lies below the rounding limit at Amazon – the deviation caused by Amazon’s own algorithm for calculating the star value is much larger.

Optimize your review management with gominga

Protect your business from the negative impact of fake reviews and optimize your online presence with the gominga review manager. Or request a free demo here to experience our customized solution for promoting authentic customer reviews and boosting consumer trust in your brand.

