Good product data from a PIM system is the key to a successful e-commerce business
Online retail is booming. As a result, the number of products and services offered in e-commerce is growing, and so is the volume of data behind them. Companies therefore need high-quality, unique product descriptions on an ever larger scale. Creating these descriptions is essential, yet it is also among the biggest organizational, manual and financial challenges in online retail. However, if companies approach the topic without preconceptions, it quickly becomes clear what high-quality product descriptions can accomplish:
- Direct influence on conversion in the buying process
- Fewer returned goods
- Higher visibility in search engines
- Higher website traffic
- Less effort spent on customer queries
- Positive product and customer experience and, as a result, customer loyalty
With these points in mind, it is worth investing in high-quality product descriptions. Descriptions that have to be created at short notice deserve particular attention, such as those for seasonal products, long-running products or new collections. The challenge is to manage the large number of product descriptions and keep them up to date, all within a financially acceptable budget. This is where the automated generation of product descriptions with Natural Language Generation (NLG) comes into play.
The PIM system improves the quality of your product data
NLG only works on a solid data foundation: the data must be available and continuously maintained. Data can be collected, archived and used in different ways. A PIM system can improve the quality and accuracy of the data, which in turn improves downstream business processes. The PIM (Product Information Management) system is a central place for all product-relevant information, specifications and digital files. From it, the product information needed for online shops can be exported to a wide range of channels.
Data quality and completeness influence the quality of the automated descriptions
In content automation, data quality has a decisive influence on the quality of the results. Top-quality product data is the basis for successfully selling products and services. Product information is provided by manufacturers and must be prepared by retailers for their target group, and this is where weak points sometimes appear. Manufacturer specifications often fulfill legal requirements but are not always comprehensible to laypersons. As a rule, manufacturer information therefore needs to be enhanced. The motto is: content is king. The more precise and targeted the data, the better. The completeness and quality of product data are crucial for a positive shopping experience.
Data Quality Rules facilitate the control and monitoring of data quality
A PIM system helps to ensure basic data quality and data structure on a technical level. Defining default values and mandatory fields influences data quality directly. These settings are part of broader data quality rules that can be configured within a PIM system, making it easier to control and monitor data quality. A good PIM can also derive additional, new attributes from existing ones.
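The three kinds of rules mentioned above (mandatory fields, default values and derived attributes) can be illustrated with a minimal Python sketch. It does not reflect the configuration format of any particular PIM system; the field names, defaults and the derived `area` attribute are assumptions chosen purely for illustration.

```python
# Hypothetical rule set for one product category (illustrative names only).
MANDATORY_FIELDS = ["name", "color", "material"]
DEFAULT_VALUES = {"unit": "cm", "gender": "unisex"}


def apply_quality_rules(product: dict) -> tuple:
    """Fill in defaults, derive new attributes and report missing mandatory fields."""
    # Values supplied with the product take precedence over the defaults.
    enriched = {**DEFAULT_VALUES, **product}
    # A mandatory field counts as missing if it is absent or empty.
    missing = [field for field in MANDATORY_FIELDS if not enriched.get(field)]
    # Derive a new attribute from existing ones when both inputs are present.
    if "width" in enriched and "height" in enriched:
        enriched["area"] = enriched["width"] * enriched["height"]
    return enriched, missing


product = {"name": "Picture frame", "color": "black", "width": 30, "height": 40}
enriched, missing = apply_quality_rules(product)
print(missing)           # ['material']
print(enriched["area"])  # 1200
```

In this sketch the missing `material` would be flagged for follow-up, while the default unit is filled in automatically.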
In practice, however, the product data in many planned projects is not yet good enough to fully exploit the potential of content automation with NLG and to achieve the desired content quality. Even then, there are several ways to get started with automated content generation. For example, the rule sets can be created in the text engine in parallel with the ongoing data preparation. Multi-stage procedures are also conceivable: the attributes already available are used in a first step, and further attributes are added to the descriptions as soon as they become available.
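The multi-stage idea can be sketched with a few lines of Python. This is a deliberately simplified template approach and not how any specific NLG product works; the rules, attribute names and sentences are assumptions for illustration. Each rule only produces a sentence when all the attributes it needs are present, so the same rule set yields longer texts as more data arrives.

```python
# Each hypothetical rule pairs the attributes it needs with a sentence template.
RULES = [
    (("name", "color"), "The {name} comes in {color}."),
    (("material",), "It is made of {material}."),
    (("width_cm", "height_cm"), "It measures {width_cm} x {height_cm} cm."),
]


def describe(product: dict) -> str:
    """Build a description from every rule whose required attributes are filled."""
    sentences = [
        template.format(**product)
        for required, template in RULES
        if all(product.get(attr) not in (None, "") for attr in required)
    ]
    return " ".join(sentences)


# Stage one: only a few attributes are available.
stage_one = {"name": "picture frame", "color": "black"}
# Stage two: further attributes have been added to the product data.
stage_two = {**stage_one, "material": "wood", "width_cm": 30, "height_cm": 40}

print(describe(stage_one))
# The picture frame comes in black.
print(describe(stage_two))
# The picture frame comes in black. It is made of wood. It measures 30 x 40 cm.
```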
The need for completeness, availability, granularity and consistency of product data
To create text content such as product descriptions automatically from product data, the data must be: 1. complete, 2. available, 3. granular and 4. consistent.
1. Completeness of data
In automated content generation, it is essential that the necessary data, that is, the product attributes, is captured completely. It is not necessary to capture every aspect of a product, but it is crucial to identify the attributes that are relevant for the texts and to design a corresponding structure. This structure can then be transferred into a data model within a PIM system.
2. Availability of the data
At first glance, the data source does not play a major role in content generation with an NLG system. It does not matter whether the structured data is read from a table or retrieved directly from the source systems via a REST API. In a fully automated workflow, however, a leading data source that provides the product data must be identified. Usually, this is the PIM system, which is connected via a connector, for example.
3. Granularity of the data
Granularity means that the available attributes are recorded as separately from one another as possible. This requirement is relevant not only for automating product descriptions; it also improves product search. If the product data is not sufficiently granular, it can be split and structured in an intermediate step, which makes it usable for the automation process.
4. Consistency of the data
The fourth important point is the consistency of the maintained product data. Different spellings of the same attribute value make it more difficult to create the trainings, as these inconsistencies have to be caught and corrected in the NLG software. Going forward, consistent notation of attribute values can be ensured, for example, through default values for individual data fields. Optimizing already existing product data also poses a challenge.
High data quality through systematic product data onboarding
How can the necessary completeness, availability, granularity and consistency of product data be achieved efficiently? The basis is systematic product data onboarding, which delivers the high data quality needed to generate high-quality automated texts.
In the onboarding process, certain steps are particularly relevant to data quality. Completeness can be ensured in the attribute mapping step, i.e. the category-specific assignment of supplier attributes to PIM attributes. This step checks whether the supplier has delivered the mandatory attributes defined for each category. Mapping the supplier values to the PIM default values also reveals possible gaps. Based on these two onboarding steps, missing product information is identified and can be requested from the supplier or manufacturer in a targeted manner. It is also possible to enrich the information with content from a content provider that makes product data and other information available for a fee.
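The attribute mapping and completeness check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the supplier attribute names, the PIM attribute names and the mandatory attributes for the hypothetical category "shoes" are all invented for the example and do not come from any real onboarding platform.

```python
# Hypothetical mapping from supplier attribute names to PIM attribute names.
ATTRIBUTE_MAPPING = {"TITLE": "name", "COLOR_NAME": "color", "MAT": "material"}
# Hypothetical mandatory PIM attributes per category.
MANDATORY_BY_CATEGORY = {"shoes": ["name", "color", "material", "size"]}


def map_and_check(supplier_record: dict, category: str) -> tuple:
    """Map supplier attributes to PIM attributes and list what is still missing."""
    mapped, unmapped = {}, []
    for key, value in supplier_record.items():
        if key in ATTRIBUTE_MAPPING:
            mapped[ATTRIBUTE_MAPPING[key]] = value
        else:
            unmapped.append(key)  # no PIM counterpart defined yet
    # A mandatory attribute is missing if it was not delivered or is empty.
    missing = [
        attr
        for attr in MANDATORY_BY_CATEGORY.get(category, [])
        if mapped.get(attr) in (None, "")
    ]
    return mapped, unmapped, missing


record = {"TITLE": "Sneaker", "COLOR_NAME": "white", "WEIGHT": "300 g"}
mapped, unmapped, missing = map_and_check(record, "shoes")
print(mapped)    # {'name': 'Sneaker', 'color': 'white'}
print(unmapped)  # ['WEIGHT']
print(missing)   # ['material', 'size']
```

The `missing` list is what could then be requested from the supplier or manufacturer in a targeted manner.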
Availability of data depends on the degree of automation of the onboarding process
Availability depends primarily on the degree of automation of onboarding: the more automated the process, the faster the data can be transferred to the PIM and used for automated content generation. The more open the PIM system is with regard to data import and export, the easier it is to use external onboarding systems and to exchange data efficiently and smoothly.
Granularity of the data through text extraction
To increase the granularity of product data, text extraction is often necessary, as suppliers do not provide certain attributes in structured form. In this process, the relevant attributes are extracted from the manufacturer's semi-structured content, such as product titles, or from unstructured product descriptions. The extracted attributes can then also be used, for example, for additional filters in the online shop.
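As a rough illustration of text extraction, the following Python sketch pulls a few attributes out of a product title with regular expressions. Real extraction pipelines are usually far more elaborate; the patterns, the value lists and the attribute names here are assumptions made up for the example.

```python
import re

# Hypothetical patterns; a real project would maintain these per category.
SIZE_PATTERN = re.compile(r"(\d+)\s*x\s*(\d+)\s*cm", re.IGNORECASE)
VALUE_PATTERNS = {
    "color": re.compile(r"\b(black|white|red|blue|green)\b", re.IGNORECASE),
    "material": re.compile(r"\b(cotton|leather|wood|steel)\b", re.IGNORECASE),
}


def extract_attributes(title: str) -> dict:
    """Extract separate, structured attributes from a semi-structured title."""
    attributes = {}
    size = SIZE_PATTERN.search(title)
    if size:
        # Store width and height as two separate numeric attributes.
        attributes["width_cm"] = int(size.group(1))
        attributes["height_cm"] = int(size.group(2))
    for name, pattern in VALUE_PATTERNS.items():
        match = pattern.search(title)
        if match:
            attributes[name] = match.group(1).lower()
    return attributes


print(extract_attributes("Picture Frame Oslo, Wood, Black, 30 x 40 cm"))
# {'width_cm': 30, 'height_cm': 40, 'color': 'black', 'material': 'wood'}
```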
Consistency of data through standardization of values
The consistency of product data can be ensured in the normalization step, in which the numbers, units, lists of values and spellings present in the data are standardized. Abbreviations can also be expanded, and synonyms can be mapped to a single designation.
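A minimal sketch of such a normalization step in Python might look like this. The synonym tables and the target notation (a number followed by a short unit) are assumptions chosen for illustration, not the behavior of a specific onboarding tool.

```python
import re

# Hypothetical lookup tables for unit spellings and value synonyms.
UNIT_SYNONYMS = {"centimeter": "cm", "centimeters": "cm", "cm.": "cm", "cm": "cm"}
VALUE_SYNONYMS = {"grey": "gray", "blk": "black"}

LENGTH_PATTERN = re.compile(r"\s*(\d+(?:[.,]\d+)?)\s*([a-zA-Z.]+)\s*")


def normalize_value(value: str) -> str:
    """Lowercase a value, trim it and map known synonyms or abbreviations."""
    key = value.strip().lower()
    return VALUE_SYNONYMS.get(key, key)


def normalize_length(value: str) -> str:
    """Standardize number format and unit spelling, e.g. '30,5 Centimeters'."""
    match = LENGTH_PATTERN.fullmatch(value)
    if not match:
        return value.strip()  # leave unrecognized values untouched
    number = float(match.group(1).replace(",", "."))
    unit = match.group(2).lower()
    unit = UNIT_SYNONYMS.get(unit, unit)
    return f"{number:g} {unit}"


print(normalize_value(" BLK "))              # black
print(normalize_length("30,5 Centimeters"))  # 30.5 cm
print(normalize_length("40cm"))              # 40 cm
```

Values that do not match the expected pattern are passed through unchanged in this sketch, so they can be reviewed manually rather than silently altered.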
Once all these steps have been completed, the prepared product data is of high quality and available. It can be imported into the PIM system and made available for further use from there. An onboarding platform is thus a valuable addition to the PIM, ensuring that the data is complete and available and has the required granularity and consistency.
To conclude, a lack of granularity, consistency and completeness in the data fields makes the trainings created within the NLG solution error-prone, unnecessarily complex and harder to maintain. And if the availability of the product data is not guaranteed, no automation can take place at all.
The PIM system plays a key role in an NLG system landscape. As a central data source, it provides all relevant product data for the NLG software and enables the transfer to different channels, such as the online shop.
Do you want to learn more details about the importance of data for the automation of product descriptions? Then take a look at this white paper.
About the author
hmmh is a partner and managed service provider of AX Semantics and has written this article on the topic of good product data from a PIM system. As a managed service provider, hmmh supports customers with a full-service offering. AX Semantics highly recommends hmmh to customers if external manpower is required.
The company is a market leader in connected commerce: hmmh brought e-commerce to Germany more than 20 years ago. Since then, the company has been shaping developments in this field. For hmmh, connected commerce is the logical continuation of the multi-channel business, in which channels become touch points and the boundaries between online and offline disappear: always the right content, at any time, at any place and across all devices.