Category Archives: Conversations API

How Bazaarvoice UGC APIs serve information to its brands & retailers

Bazaarvoice has thousands of clients, including brands and retailers, and holds billions of product catalog and User Generated Content (UGC) records from those clients. When a shopper visits a brand or retailer site/app powered by Bazaarvoice, our APIs are triggered.

In 2023, Bazaarvoice UGC APIs recorded peak traffic of more than 3 billion calls per day with zero incidents. This blog post discusses the high-level design strategies implemented to handle this traffic while serving hundreds of millions of pieces of User Generated Content to shoppers/clients around the globe.

The following actions can take place when shoppers interact with our User-Generated Content (UGC) APIs.

  • Writing Content
    • When a shopper writes content such as a review or comment on any product on a retailer or brand site, it invokes a call to Bazaarvoice’s write UGC APIs, followed by authenticity checks and content moderation.
  • Reading Content
    • When a shopper visits the brand or retailer site/app for a product, Bazaarvoice’s read UGC APIs are invoked.

Traffic: 3+ billion calls per day (peak)
Data: ~5 billion records, terabyte scale

High-level API Flow:

  1. Whenever a request is made to Bazaarvoice UGC API endpoints, the Bazaarvoice gateway service receives the request, authenticates the request, and then transmits the request information to the application load balancer.
  2. Upon receiving the request from the load balancer, the application server engages with the authentication service to authenticate the request. If the request is deemed legitimate, the application calls its database servers to retrieve the necessary information and formulates the response accordingly.

Let’s get a bit deeper into the design.

Actions taken at the gateway upon receiving a request

  • API authentication:

We have an authentication service integrated into the gateway to validate each request. If the request is valid, we proceed further. Validation includes ensuring that the request comes from a legitimate source serving one of Bazaarvoice’s clients.

  • API security:

If our APIs experience security attacks such as malicious or DDoS requests, the web application firewall (WAF) intercepts and blocks them according to its configured settings.

  • Response Caching:

We implemented response caching to improve response times and client page load performance. How long a response stays cached is determined by the Time-to-Live (TTL) configured for the request. If the same request is received again within that window, our gateway returns the cached response rather than forwarding the request to the server.
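To make the TTL behavior concrete, here is a minimal sketch of a response cache. It is not Bazaarvoice’s gateway code: the class name, the `request_key` string, and the `fetch` callback are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TTLResponseCache:
    """Tiny in-memory response cache keyed by request, with a fixed TTL.

    Illustrative only: a real gateway cache would also bound its size
    and decide carefully which requests are cacheable.
    """

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_fetch(self, request_key: str, fetch: Callable[[], str]) -> str:
        now = self.clock()
        entry = self._entries.get(request_key)
        if entry is not None and now < entry[0]:
            return entry[1]  # cache hit: the server is not contacted
        response = fetch()  # miss or expired entry: forward to the server
        self._entries[request_key] = (now + self.ttl_seconds, response)
        return response
```

A shorter TTL keeps content fresher at the cost of more requests reaching the application servers; a longer TTL does the opposite.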

Understanding User-Generated Content (UGC) Data Types and API Services

Before delving into specifics of how the UGC is originally collected, it’s important to understand the types of data being served.

For example:

  • Ratings & Reviews
  • Questions & Answers
  • Statistics (Product-based Review Statistics and Questions & Answers Statistics)
  • Products & Categories

For more details, you can refer to the Conversations API documentation via Bazaarvoice’s recently upgraded Developer Center.

Now, let’s explore the internals of these APIs in detail, and examine their interconnectedness.

  • Write UGC API service
  • Read UGC API service

Write UGC API service:

Our submission form is customized for each client. The form renders based on the client configuration, which can include numerous custom data attributes to serve the client’s needs. When a shopper submits content such as a review or a question through the form, our system writes this content to a submission queue. A downstream internal system then retrieves this content from the queue and writes it into the master database.

Why do we have to use a queue rather than directly writing into a database?

  • Load Leveling
  • Asynchronous Processing
  • Scalability
  • Resilience to Database Failures
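The queue-based write path can be sketched as follows. This is a simplified illustration using Python’s standard `queue` module and a list standing in for the master database; the function names are hypothetical and the real system uses separate services rather than threads in one process.

```python
import queue
import threading

_STOP = object()  # sentinel telling the consumer to shut down


def drain_to_database(submission_queue, database):
    """Downstream consumer: move items from the queue into the database."""
    while True:
        item = submission_queue.get()
        if item is _STOP:
            break
        database.append(item)  # stand-in for a real database write


def process_submissions(submissions):
    """Accept submissions via a queue and return what reached the database."""
    submission_queue = queue.Queue()
    database = []
    consumer = threading.Thread(
        target=drain_to_database, args=(submission_queue, database)
    )
    consumer.start()
    for content in submissions:
        # The producer only enqueues; it never waits on the database.
        submission_queue.put(content)
    submission_queue.put(_STOP)
    consumer.join()
    return database
```

Because the producer only enqueues, a burst of submissions or a slow database delays processing rather than failing the shopper’s request, which is the load-leveling and resilience benefit listed above.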

Read UGC API service:

The UGC read API’s database operates independently from the primary, internal database. While the primary database contains normalized data, the read API database is designed to serve denormalized and enriched data specifically tailored for API usage in order to meet the response time expectations of Bazaarvoice’s clients and their shoppers.

Why do we need denormalized data?

To handle large-scale traffic efficiently and avoid complex join operations in real-time, we denormalize our data according to specific use cases.

We transform the normalized data into denormalized enriched data through the following steps:

  1. Primary-Replica setup: This helps us separate write and read calls.
  2. Data denormalization: In the replica DB, triggers process the data (joining multiple tables) and write the results into staging tables. An application reads data from the staging tables and writes the denormalized data into a NoSQL DB, where it is segregated by content type. This data is then forwarded to message queues for enrichment.
  3. Enriching the denormalized data: Our internal applications consume this data from the message queues and, with the help of internal state stores, enrich the documents (e.g., with the average rating of a product or the total amount of UGC for a product) before forwarding them to a destination message queue.
  4. Data transfer to the UGC application database: A connector application consumes data from the destination message queue and writes it into the UGC application database.
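The denormalize-then-enrich idea in the steps above can be sketched in a few lines. The record shapes and field names below (`product_id`, `rating`, `average_rating`, `total_reviews`) are assumptions made for illustration, not the actual Bazaarvoice schema, and the in-memory functions stand in for the triggers, queues, and state stores described in the post.

```python
from collections import defaultdict
from statistics import mean


def denormalize(reviews, products):
    """Join each review with its product so no join is needed at read time."""
    return [
        {**review, "product_name": products[review["product_id"]]["name"]}
        for review in reviews
    ]


def enrich(documents):
    """Attach per-product aggregates (average rating, review count)."""
    ratings = defaultdict(list)  # plays the role of an internal state store
    for doc in documents:
        ratings[doc["product_id"]].append(doc["rating"])
    return [
        {
            **doc,
            "average_rating": mean(ratings[doc["product_id"]]),
            "total_reviews": len(ratings[doc["product_id"]]),
        }
        for doc in documents
    ]


products = {"p1": {"name": "Widget"}, "p2": {"name": "Gadget"}}
reviews = [
    {"id": "r1", "product_id": "p1", "rating": 5},
    {"id": "r2", "product_id": "p1", "rating": 4},
    {"id": "r3", "product_id": "p2", "rating": 3},
]
read_documents = enrich(denormalize(reviews, products))
```

Each resulting document carries everything a read request needs, so the read API can answer from a single lookup.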

Now that you’ve heard how Bazaarvoice’s APIs handle the large client and request scale, let’s add another layer of complexity to the mix!

Connecting Brands and Retailers

Up to this point, we’ve discussed the journey of content within a given client’s dataset. Now, let’s delve into the broader problem that Bazaarvoice addresses.

Bazaarvoice helps its brands and retailers share reviews within the Bazaarvoice network. For more details, refer to syndicated-content.

Let’s talk about the scale and size of the problem before getting into details.

Across 12,000+ Bazaarvoice clients, we have billions of catalog and UGC records. Bazaarvoice provides a platform to share this content within its network. The data is logically separated for each client.

Clients can access their own data directly, and they can access other Bazaarvoice clients’ data based on the connections configured in the Bazaarvoice Network.

For example:

In the diagram above, Retailer (R3) wants to increase sales of a product by showing a good amount of UGC.

Retailer (R1): 1 billion catalog & UGC records
Retailer (R2): 2 billion catalog & UGC records
Retailer (R3): 0.5 billion catalog & UGC records
Retailer (R4): 1.2 billion catalog & UGC records
Brand (B1): 0.2 billion catalog & UGC records
Brand (B2): 1 billion catalog & UGC records

Now consider:

If Retailer (R3) accessed only its own data, it would operate on 0.5 billion records. Here, however, Retailer (R3) is also configured to get UGC data from Brand (B1), Brand (B2), and Retailer (R1).

The scale is now 0.5 + 0.2 + 1 + 1 = 2.7 billion records.

To get the data for one request, the system has to query across 2.7 billion records. On top of that, we have filters and sorting, which make it even more complex.
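The arithmetic in the example above can be written out as a small calculation. The record counts come from the example; the `connections` mapping and the function name are hypothetical, and counts are kept in millions so the sums stay exact.

```python
# Record counts from the example, in millions of catalog & UGC records.
records_in_millions = {
    "R1": 1000, "R2": 2000, "R3": 500, "R4": 1200, "B1": 200, "B2": 1000,
}

# Network connections: which other clients' data each client may read.
# Only R3's connections are given in the example.
connections = {"R3": ["B1", "B2", "R1"]}


def searchable_records(client):
    """Total records a single request for `client` may have to consider."""
    sources = [client] + connections.get(client, [])
    return sum(records_in_millions[source] for source in sources)


r3_total = searchable_records("R3")  # 500 + 200 + 1000 + 1000 = 2700 million
r4_total = searchable_records("R4")  # no connections configured: 1200 million
```

With its three connections, R3 goes from 500 million to 2,700 million records in scope, more than five times its own dataset.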

In Summary

I’ve oversimplified here to help you understand the solution that Bazaarvoice provides. In reality, it’s much more complex to serve the UGC Write and Read APIs at a global scale with fast response times while remaining globally resilient and maintaining high uptime.

You can now see why this kind of architecture was designed to solve the problem. Hopefully, after reading this post, you have a better understanding of what it takes behind the scenes to serve User Generated Content across brands and retailers at billion-record scale to shoppers across the globe.

Conversations API Deprecation for Versions 4.9, 5.0 and 5.1, and Custom Domains

This blog post only applies to the Conversations API and does not apply to any other Bazaarvoice product. You are able to identify the Bazaarvoice Conversations API by the following:

  • Path includes ‘data’: http://api.bazaarvoice.com/data/reviews.json?

Code related to the Bazaarvoice Hosted Display does not need modification. It can be identified by the following:

  • References ‘bvapi.js’: http://display.ugc.bazaarvoice.com/static/ClientName/en_US/bvapi.js

Still unsure if this applies to you? Learn more.

Today we are announcing two important changes to our Conversations API services:

  • Deprecation of Conversations API versions older than 5.2 (4.9, 5.0, 5.1)
  • Ending Conversations API service using custom domains

Both of these changes will go into effect on April 30, 2016.

Our newer APIs and universal domain system offer you important advantages in both features and performance. In order to best serve our customers, Bazaarvoice is focusing its API efforts on the latest, highest performing API services. By deprecating older versions, we can refocus our energies on the current and future API services, which we feel offer the most benefits to our customers. Please visit our Upgrade Guide to learn more about the Conversations API, our API versioning, and the steps necessary to support the upgrade.

We understand that this news may be surprising. This is your first notification of this change. In the months and weeks ahead, we will continue to remind you that this change is coming.

We also understand that this change will require effort on your part. Bazaarvoice is committed to making this transition easy for you. We are prepared to assist you in a number of ways:

  • Pre-notification: You have 12 months to plan for and implement the change.
  • Documentation: We have specific documentation to help you.
  • Support: Our support team is ready to address any questions you may have.
  • Services: Our services teams are available to provide additional assistance.

In summary, on April 30, 2016, Conversations API versions released before 5.2 will no longer be available. Applications and websites using versions before 5.2 will no longer function properly after April 30, 2016. In addition, all Conversations API calls made to a custom domain, regardless of version, will no longer respond. Applications and websites using custom domains (such as “ReviewStarsInc.ugc.bazaarvoice.com”) will no longer function properly after April 30, 2016. If your application or website is making API calls to Conversations API versions 4.9, 5.0, or 5.1, you will need to upgrade to the current Conversations API (5.4) and use the universal domain (“api.bazaarvoice.com”). Applications using Conversations API versions 5.2 and later (5.2, 5.3, 5.4) with the universal domain will continue to receive uninterrupted API service.
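As a rough aid for auditing your own integrations, the rules above can be expressed as a small check. This is an illustrative sketch, not a Bazaarvoice tool: the function name is hypothetical, and it assumes you already know the host and the API version string each of your calls uses.

```python
UNIVERSAL_DOMAIN = "api.bazaarvoice.com"
MINIMUM_SUPPORTED_VERSION = (5, 2)


def needs_upgrade(host: str, api_version: str) -> bool:
    """Return True if a call is affected by the April 30, 2016 changes.

    A call is affected when it targets a custom domain (any host other
    than the universal domain) or uses a version older than 5.2.
    """
    version = tuple(int(part) for part in api_version.split("."))
    uses_custom_domain = host.lower() != UNIVERSAL_DOMAIN
    uses_old_version = version < MINIMUM_SUPPORTED_VERSION
    return uses_custom_domain or uses_old_version
```

For example, `needs_upgrade("ReviewStarsInc.ugc.bazaarvoice.com", "5.4")` is true because of the custom domain, even though the version itself remains supported.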

If you have any questions about this notice, please submit a case in Spark. We will periodically update this blog and our developer Twitter feed (@BazaarvoiceDev) as we move closer to the change of service date.

Thank you for your partnership,
Chris Kauffman
Sr. Product Manager

Conversations API Inspector

An all too familiar scenario

Imagine you’re a developer working for Widgets n’More. The marketing team just came up with a new cross platform social media promotion. It’s going to involve collecting user generated content in the form of ratings and reviews. As luck would have it you remember your friend on the Ecom Team had mentioned working with a company called Bazaarvoice last year. Widgets n’More partnered with Bazaarvoice specifically for the purpose of collecting and displaying reviews.

In short order you find yourself at developer.bazaarvoice.com where you start reading the Conversations API documentation. There are a lot of fields and parameters and content types. What’s more, some of them appear to be customizable. They offer custom rating fields, tags, context data questions, and additional free text fields. One company might have “rating_value” while another might be using “rating_quality”. It’s also not immediately clear how any of those should be displayed in a webform. The fields can even have customizable properties like min and max length.

So, you call your friend hoping she’ll be able to shed some light on the situation. She explains that Bazaarvoice can even configure fields based on content type, like reviews, questions or answers, and make different custom fields available depending on the parent category of a product. Unfortunately it’s been so long since the initial Bazaarvoice implementation that she doesn’t remember what was set up. If only there was an easy way for you to see exactly what fields are available taking all those factors into account…

Conversations API Inspector to the rescue

The Conversations API Inspector was created with the above scenario in mind. It is a web-based app that shows what fields can be submitted to the Bazaarvoice platform using the Conversations API for any API key + content type + ID combination. With the Conversations API Inspector, our imaginary developer would be able to see what fields are available, how they must be submitted in an HTTP request, metadata about each field, and much more.

The Conversations API Inspector is ready to use and publicly available at http://api-inspector.bazaarvoice.com/. It is well documented at our Developer Portal, so instead of repeating that here I’ll leave you with some screenshots.

[Screenshots: noHistory, topNoFields, topWithFields, fields, fieldDetailsProperties, fieldDetailsSubmission]

Epilogue

Even without the Conversations API Inspector, all would not have been lost for our imaginary developer, who could have used the API itself to determine what fields are available. In fact, this is exactly how the Conversations API Inspector does it. Of course, the Inspector provides a much more user-friendly and interactive GUI than the raw JSON or XML returned by the Conversations API. You can read more about how the Inspector works in the documentation under the heading “How it works”.

Mashery-powered Bazaarvoice Developer portal is LIVE!

Welcome to the Mashery-powered Bazaarvoice Developer portal. We strive to give you the tools you need to develop cutting-edge applications on the Bazaarvoice platform.

Some changes you’ll notice:

  • You no longer have to log in to see documentation. Just click the Expand icon to drill down to the information you need.
  • If you want to request an API key or need to contact us with a support question, you will need to create and use a Mashery ID (or use your existing one if you access other Mashery-powered APIs). Your current Bazaarvoice developer portal ID will no longer work.

Note that none of your existing API keys are affected by this transition. They will continue to work without interruption.

Thanks for your support of the Bazaarvoice platform.

Platform API release notes, version 5.4

We are pleased to announce that the following functionality has been developed for version 5.4:

  • Submission forms pre-filled for non-anonymous users
  • Full text search on all UGC and on includes
  • Product family queries
  • Photo upload accepts URLs
  • Brightcove Smart Player Javascript integration
  • Story rating field exposed in the response
  • Special product attributes exposed in the response
  • New filtering capabilities

More detailed information on each of these items is listed below. For complete documentation, refer to the Platform API documentation, version 5.4.

Submission forms pre-filled for non-anonymous users

When submitting content, the values of all known submission fields are now returned in the submission response fields. This only affects submissions where the user is not anonymous and the user/userid parameter is provided with the GET request.

Full text search on all UGC and on includes

The following content types were added to the existing search capabilities:

  • reviews
  • answers
  • comments (story and review)
  • stories

All content is now searchable. For a list of all the fields that are searched for any given content type, see the API Basics page.

Product family queries

When filtering by product id, all content from that product’s product family is also returned by default. There is a new excludeFamily parameter that you can set to not return product family content. For examples and full documentation, see the Product Display method page.

Photo upload accepts URLs

The uploadphoto endpoint now accepts HTTP URLs of images in addition to locally stored photos from the client side. For examples and full documentation, see the Photo Submission method page.

Brightcove Smart Player Javascript integration

Brightcove videos can be loaded in a variety of ways. The information necessary to load these videos in the browser is now returned in the Videos block of the response elements. See the API Basics page for details on the new response items that were added to support Brightcove videos.

Story rating field exposed in the response

The story display response has a new block called “StoryRating” that contains two fields:

  • Average score – average of the rating feedback score displayed for each story ID
  • Range – range of the average score

Special product attributes exposed in the response

The product display response has new fields for each of the following five product attributes:

  • EANs
  • UPCs
  • ISBNs
  • ModelNumbers
  • ManufacturerPartNumbers

New filtering capabilities

The following new filters are available:

  • Affiliation filter on reviews
  • Brand answer filters on questions and answers
  • Brand external ID filter for reviews, stories, questions, and products
  • Content locale filter on inline ratings (statistics.json)

For more information, see the appropriate method’s documentation.

Platform API release notes, version 5.3

We are pleased to announce that the following functionality has been developed for version 5.3:

  • Hosted authentication – email
  • Feedback submission for comments
  • RatingDistribution (Histogram data) and SecondaryRatingsAverages added to review statistics
  • Time zone changed to UTC
  • Error codes added to form errors
  • Syndication attribution on reviews

More detailed information on each of these items is listed below. For complete documentation, refer to the Platform API documentation, version 5.3.

Hosted authentication – email

Hosted email authentication can be used during submission to confirm the identity of a content submitter. When submitting content for the first time, a user receives an email containing a link. When the link is clicked, the user is directed to a landing page that calls back to the API to confirm their identity. This call results in the generation of an encrypted user token that can be used in subsequent submission calls. Depending on your configuration, the submitter’s content might not be accepted until the confirmation call is submitted. In order to use this feature, you must have hosted authentication enabled for your submission process. If you need more information, read the “Bazaarvoice hosted authentication reference guide” and the submission method documentation for details on the required parameters.

Feedback submission for comments

Feedback submission for review comments and story comments is now supported in addition to the existing support for feedback submission on reviews, questions, answers, and stories. For complete documentation, see the Feedback Submission method page.

RatingDistribution and SecondaryRatingsAverages added to review statistics

New RatingDistribution and SecondaryRatingsAverages blocks have been added to the ReviewStatistics block. You can now see the distribution of ratings for each product, which allows you to construct a rating histogram. You can also see the average rating of your secondary rating dimensions for reviews in relation to products and authors.
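As an illustration of constructing a rating histogram, the sketch below turns a rating distribution into text bars. The input shape, a list of entries with `RatingValue` and `Count` keys, is an assumption made for this example; check the version 5.3 documentation for the exact structure of the RatingDistribution block.

```python
def rating_histogram(distribution, max_rating=5, width=20):
    """Render one text bar per rating value, highest rating first.

    `distribution` is assumed to be a list of dicts such as
    {"RatingValue": 5, "Count": 6}; ratings missing from it count as 0.
    """
    counts = {entry["RatingValue"]: entry["Count"] for entry in distribution}
    total = sum(counts.values())
    lines = []
    for rating in range(max_rating, 0, -1):
        count = counts.get(rating, 0)
        bar_length = round(width * count / total) if total else 0
        lines.append(f"{rating} | {'#' * bar_length} {count}")
    return lines


sample = [
    {"RatingValue": 5, "Count": 6},
    {"RatingValue": 4, "Count": 3},
    {"RatingValue": 1, "Count": 1},
]
histogram_lines = rating_histogram(sample)
```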

Time zone changed to UTC

The API now returns all time data using UTC (+00:00) to avoid the confusion of multiple time zones. The date format has not changed.

Error codes added to form errors

The API response has been updated to return error codes in addition to the existing error message for all form errors. A complete listing of the error codes can be found in each submission method.

Syndication attribution on reviews

All reviews have an “isSyndicated” field set to true or false. If the review is syndicated, a SyndicationSource block is displayed with details of where the review is being syndicated from. Syndicated content can only be returned if the API key is configured to show syndicated content.

Platform API Release Notes, Version 5.2

We are pleased to announce that the following functionality has been developed for version 5.2:

  • Helpfulness and inappropriate content feedback submission enabled
  • ContentLocale no longer filtered implicitly by default
  • Product and category attributes populated as a map
  • Hosted video submission and display updated
  • Inline ratings data exposed for product-based review statistics

More detailed information on each of these items is listed below. For complete documentation, refer to the Platform API Documentation, version 5.2.

Helpfulness and inappropriate feedback

Submission of helpfulness votes and inappropriate feedback can now be done through the API. The response for the user-generated content has also been updated to display the actual inappropriate feedback and total vote and feedback counts. In order for inappropriate feedback to be populated, the API key must be updated.

Implicit default ContentLocale filter removed

There is no longer any implicit ContentLocale filter if none is specified as an argument. If no filter is provided, all content will be returned, regardless of what locale the content is in. There is a default locale defined for every API key. Prior to version 5.2, if the locale parameter was used, it caused an implied ContentLocale filter to be used.

Note that version 5.2 does not change the behavior of explicitly supplied ContentLocale filters. In addition, you can now ask for labels in any locale and specify a different content locale. Therefore, if you request a locale of en_US and a ContentLocale of fr_FR, you get English labels and French content.
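The English-labels-with-French-content case might be expressed as query parameters along these lines. This is a sketch under assumptions: the `Locale` parameter name and its capitalization are not confirmed by these release notes, while the `Filter=ContentLocale:eq:...` form follows the filter syntax shown in the version 5.1 notes.

```python
from urllib.parse import parse_qs, urlencode


def build_locale_query(label_locale: str, content_locale: str) -> str:
    """Build a query string asking for labels in one locale and content in another.

    Parameter names here are assumptions for illustration; consult the
    API documentation for the exact spelling.
    """
    return urlencode(
        {
            "Locale": label_locale,  # locale used for labels
            "Filter": f"ContentLocale:eq:{content_locale}",  # explicit content filter
        }
    )


query = build_locale_query("en_US", "fr_FR")
decoded = parse_qs(query)  # round-trip to show what the server would receive
```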

Product and category attributes

Products and categories now have a new attributes field populated. This field contains a map of attributes provided to Bazaarvoice from a product feed import.

Hosted video submission and display

Video elements of all content now contain URLs that can be used to embed the video into an HTML page. Bazaarvoice provides boilerplate HTML tags for use with embedding these videos. For more information, see the API Basics page.

Inline ratings data

A new method has been created to provide a quick way to access inline ratings data for products. For complete documentation, see the Statistics Display method page.

Bazaarvoice Platform API Tutorial

We recently released our new Bazaarvoice Platform API. This is a new RESTful API that allows access to much more data and provides responses in XML and JSON. We are really excited to see the types of applications our clients will be building on the API.

For a quick introduction to the API, we created the API Tutorial (you must be logged in to view) that walks through creating a Javascript-based widget for displaying Bazaarvoice content on any webpage. The tutorial explores the JSONP response format of the API and how it can be used with Javascript templates to inject content onto the page.

You can read the API tutorial or go grab the code from Github and start checking out the new API.

Home Depot using the Platform API to show off their UGC

Clients like the Home Depot are using ideas from our Inspiration Gallery to find innovative ways to show off their user-generated content (UGC) and demonstrate the importance of listening to their customers.

The following image is taken from a Home Depot Store Managers meeting which had all store managers as well as suppliers in attendance. It shows real-time reviews on a globe using a Google Earth placemark based on the reviewer’s IP address.

[Image: homedepot_earth_app]

Want to try it for yourself?

Click here to apply for an API key, then click here to download the reference app. Don’t forget to send us a picture or video of your app in motion.

Platform API Release Notes, Version 5.1

We are pleased to announce that the following functionality has been developed for version 5.1:

  • ReviewStats updated
  • Moderator Codes for user-generated content (UGC) exposed
  • Wildcard character in ContentLocale filter enabled
  • IP address in Content Display exposed
  • API key creation and management added to the client portal

More detailed information on each of these items is listed below. For complete documentation, refer to the Platform API Documentation, version 5.1.

ReviewStats

Review statistics for products now return ReviewStats, which account for product family data in the statistics calculations. Prior to version 5.1, ReviewStats returned NativeStats, which do not take product families into account.

Moderator Codes

Moderator codes (content codes) can now be explicitly requested and filtered. A new filter (ModeratorCode) has been added and can accept any moderator code values. For a list of all moderator codes, see the API Basics page. In addition, a new value for the attribute parameter (ModeratorCodes) has been added and must be requested in order to filter by ModeratorCode.

Wildcard character in ContentLocale filter

The ContentLocale filter now accepts the asterisk (‘*’) as a wildcard character. This wildcard character represents zero or more characters. Adding the wildcard enables filtering for multiple locales without requiring each one to be listed. For example, &Filter=ContentLocale:eq:en* would return all the English locales (en_US, en_CA, en_GB, etc.). When the ContentLocale filter is set to only the wildcard character, it is the equivalent of requesting all locales. It is important to note that the value of the filter cannot otherwise start with a wildcard character. For example, &Filter=ContentLocale:eq:*_US is not a valid filter.
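The matching rules described above can be mimicked client-side with a short function, which can help when predicting which locales a filter value will cover. This is an illustrative sketch rather than the API’s implementation; the function name is hypothetical, and case-sensitive matching is an assumption.

```python
import re


def matches_content_locale(pattern: str, locale: str) -> bool:
    """Check a locale against a ContentLocale filter value.

    '*' stands for zero or more characters. A pattern may be the lone
    wildcard '*', but may not otherwise start with a wildcard.
    """
    if pattern != "*" and pattern.startswith("*"):
        raise ValueError(f"filter value cannot start with a wildcard: {pattern!r}")
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, locale) is not None
```

For instance, `matches_content_locale("en*", "en_GB")` is true, while passing `"*_US"` raises `ValueError`, mirroring the invalid filter in the text.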

IP address in Content Display

When the API key is configured to allow access to IP addresses, you will either get the IP address back or null inside the IpAddress element. If you are interested in getting access to IP address within an application, contact technical support.

API key creation and management in the client portal

Clients that have signed the API Data Agreement can now create and manage their Developer API keys via the client portal. This functionality allows you to self-manage the API keys that you use to create applications using the Developer Platform APIs and tools. A detailed list of the prerequisites and procedures for creating, viewing and editing your API keys can be found in the Bazaarvoice Release Notes, version 5.1. (You have to log in to the Spark portal to view.)

The addition of the wildcard character in the ContentLocale filter is going to be very useful, as it is required quite often. All the new functionality might make it easier to access the platform in the new version.