Content Services

Configuration options for Lytics' suite of content services.

The following configuration options are available in the Content section of the account settings.

Extract allowed topics



Enable extraction of allowed content topics from the body of the document.

Content topic blocklist



A list of topics to be hidden from the overall content taxonomy. For example, a dinner restaurant may not care about topics related to breakfast and thus block them so as not to clutter affinities, as pictured above.

Content topic allowlist



A list of topics that must be included in the topic graph and treated as candidates for content affinity. For example, building on the previous scenario, the same dinner restaurant may want to ensure that the topics they care about are always present.

Content Allowed Query Parameters



A list of query parameters that should be retained during URL normalization, such as a page ID or product SKU.
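
As a rough illustration of what retaining query parameters during normalization can mean, the Python sketch below drops every query parameter that is not on an allowed list. The function name `normalize_url` and the parameter names `id` and `sku` are hypothetical, and the actual normalization performed by Lytics may include other steps.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def normalize_url(url, allowed_params):
    """Hypothetical sketch: keep only the query parameters named in allowed_params."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in allowed_params
    ]
    # Rebuild the URL with the filtered query string; everything else is unchanged.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment))

print(normalize_url("https://www.example.com/product?sku=123&utm_source=mail", {"id", "sku"}))
# https://www.example.com/product?sku=123
```

Without `sku` on the allowed list, both parameters would be dropped and every product page would collapse to the same normalized URL.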

Content domains blocklist



Domains that should not be classified even though events may be collected from them. Note: filtering requires an exact match against the domain, such as "example.com" or "sub.example.com".
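
The exact-match requirement can be pictured with a short Python sketch. The helper `is_blocked_domain` is hypothetical and only models the rule described here; it assumes list entries are lowercase hostnames.

```python
from urllib.parse import urlsplit

def is_blocked_domain(url, blocked_domains):
    """Hypothetical sketch: block only when the URL's host exactly equals a list entry."""
    host = urlsplit(url).hostname or ""
    return host in blocked_domains

blocked = {"example.com"}
print(is_blocked_domain("https://example.com/menu", blocked))      # True
print(is_blocked_domain("https://sub.example.com/menu", blocked))  # False
```

Under this reading, blocking a subdomain requires listing it explicitly, for example by adding "sub.example.com" to the list.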

Content blocked pages



Block any URL that exactly matches an item in this list. Entries include the domain but not the protocol, e.g. `www.example.com/404.html`, not `https://www.example.com/404.html`.

Content paths blocklist



Prevent classification of any page with a substring match of the path. For instance, `/contact` would prevent classification for any URL that contains `/contact` anywhere in the URL.
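
To contrast the exact matching of Content blocked pages with the substring matching of Content paths blocklist, here is a minimal Python sketch. All function names are hypothetical, and the sketch assumes the comparison is made against the domain plus path with the protocol and query string removed, which this page does not confirm.

```python
from urllib.parse import urlsplit

def strip_protocol(url):
    """Return domain + path without the scheme, e.g. 'www.example.com/404.html'."""
    parts = urlsplit(url)
    return parts.netloc + parts.path

def is_blocked_page(url, blocked_pages):
    """Exact match: the whole protocol-less URL must equal a list entry."""
    return strip_protocol(url) in blocked_pages

def is_blocked_path(url, blocked_paths):
    """Substring match: any list entry appearing in the URL blocks it."""
    target = strip_protocol(url)
    return any(fragment in target for fragment in blocked_paths)

pages = {"www.example.com/404.html"}
paths = ["/contact"]
print(is_blocked_page("https://www.example.com/404.html", pages))        # True
print(is_blocked_page("https://www.example.com/about/404.html", pages))  # False
print(is_blocked_path("https://www.example.com/about/contact-us", paths))  # True
```

The substring rule is broad: `/contact` also matches `/contact-us` and `/about/contact`, so short entries can block more pages than intended.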

Content boosted attributes



A list of IDs/classes to boost during body extraction.

Content created since date



Only include content in the index if the created date is after the specified date.

Custom content properties delimiter



The delimiter to use when parsing custom content topics from HTML meta tags.

Content custom properties



List of meta tags to include as custom topics.
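
The two settings above work together: one names the meta tags to read, the other says how to split their values. The Python sketch below shows one way this could look. The class `MetaTopicParser`, the tag name `section-topics`, and the assumption that tags are identified by their `name` attribute are all hypothetical.

```python
from html.parser import HTMLParser

class MetaTopicParser(HTMLParser):
    """Hypothetical sketch: collect topics from selected <meta> tags."""

    def __init__(self, tag_names, delimiter):
        super().__init__()
        self.tag_names = set(tag_names)
        self.delimiter = delimiter
        self.topics = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Assumption: the configured meta tags are matched on their name attribute.
        if attr.get("name") in self.tag_names and attr.get("content"):
            for piece in attr["content"].split(self.delimiter):
                if piece.strip():
                    self.topics.append(piece.strip())

parser = MetaTopicParser(tag_names=["section-topics"], delimiter=",")
parser.feed('<meta name="section-topics" content="pasta, wine,dessert">')
print(parser.topics)  # ['pasta', 'wine', 'dessert']
```

If the page used `|` between values instead, the same tag would yield a single topic until the delimiter setting was changed to match.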

Observe robots.txt in content enrichment

Observe robots.txt and meta directives:

robotstxt - Observe only robots.txt directives.
meta - Observe only directives in meta tags.
none - Do not observe any directives.
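
The three modes listed above can be sketched in Python using the standard library's `urllib.robotparser`. The function `may_enrich` is hypothetical, and treating `noindex` as the blocking meta directive is an assumption; this page does not say which meta directives are honored.

```python
from urllib.robotparser import RobotFileParser

def may_enrich(mode, url, robots_txt="", meta_robots="", user_agent="*"):
    """Hypothetical sketch: decide whether a page may be enriched under each mode."""
    if mode == "none":
        return True
    if mode == "robotstxt":
        rules = RobotFileParser()
        rules.parse(robots_txt.splitlines())
        return rules.can_fetch(user_agent, url)
    if mode == "meta":
        # Assumption: a robots meta tag containing "noindex" blocks enrichment.
        directives = {d.strip().lower() for d in meta_robots.split(",")}
        return "noindex" not in directives
    raise ValueError(f"unknown mode: {mode}")

robots = "User-agent: *\nDisallow: /private/"
print(may_enrich("robotstxt", "https://www.example.com/private/page", robots_txt=robots))  # False
print(may_enrich("meta", "https://www.example.com/menu", meta_robots="noindex, follow"))   # False
print(may_enrich("none", "https://www.example.com/private/page", robots_txt=robots))       # True
```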

Content since date



Only include content in the index if the enriched date is after the specified date.

Content domains allowlist



Perform content filtering based on exact matches of domains in a URL. Entries should include any relevant subdomains.

Content paths allowlist



Perform content filtering based on partial matches of any URL component.

Supported Content Languages



List of languages to permit during the content enrichment process. If empty, then only English content will be processed.
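
The empty-list default can be expressed as a small Python sketch. The function name is hypothetical, and the use of two-letter codes such as `en` is an assumption, since this page does not say how languages are identified.

```python
def is_language_permitted(lang, supported_languages):
    """Hypothetical sketch: an empty list falls back to English only."""
    permitted = supported_languages or ["en"]
    return lang in permitted

print(is_language_permitted("en", []))            # True
print(is_language_permitted("fr", []))            # False
print(is_language_permitted("fr", ["en", "fr"]))  # True
```

Note that once the list is non-empty, English is no longer implied and would need to be listed explicitly under this reading.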
