Elastic Search Magento 2

Elastic Search

Use Kibana For Elastic Search GUI

On a Mac, Elasticsearch needs to be installed from the download on the website and started manually, rather than installed via Homebrew, as the Homebrew install causes an X-Pack error when running Kibana.

Download Elasticsearch, and execute bin/elasticsearch to start.
Download Kibana, and execute bin/kibana-plugin

Navigate to localhost:5601 to access the GUI

Troubleshooting (so early? REALLY?! Yes.)

Summary of Elasticsearch

  • Full-text search engine
  • NoSQL database
  • Analytics engine
  • Written in Java
  • Lucene-based (~ Solr)
  • Inverted indices
  • Easy to scale (~Elastic)
  • RESTful interface (HTTP/JSON)
  • Schemaless (kinda)
  • Real-time
  • ELK Stack

HTTP Codes

Elasticsearch uses HTTP status codes correctly, so a 201 (Created) will be returned when a document is created. Creating an index is acknowledged in the response body, using calls such as

PUT /blog

Returns
{"acknowledged": true}
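
To watch the status codes come back, curl can print just the code. This is a minimal sketch, assuming Elasticsearch 7.x listening on localhost:9200 with no existing blog index; the http_status helper and the document body are made up for illustration.

```shell
ES="localhost:9200"

# Hypothetical helper: discard the response body and print only the HTTP status code.
http_status() {
  curl -s -o /dev/null -w '%{http_code}\n' "$@"
}

# Create the index. On a fresh node this should answer 200,
# with {"acknowledged": true} in the (discarded) body.
http_status -X PUT "$ES/blog"

# Index a new document. A newly created document should answer 201.
http_status -X POST "$ES/blog/_doc" \
  -H 'Content-Type: application/json' \
  -d '{"title": "Hello Elasticsearch"}'
```

If nothing is listening on that port, curl prints 000 instead of a real status code.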

Schemaless (Kinda) – Mappings

Elasticsearch is mostly schemaless, in that it will guess the datatype of fields. Sometimes, it gets this wrong. To tell Elasticsearch the correct type for a data key, use mappings.

Mappings are explicitly set at index creation time

E.g.

[json]
{
  "mappings" : {
    "post" : {
      "properties" : {
        "title" : {
          "type" : "string"
        },
        "date" : {
          "type" : "date",
          "format" : "E, dd MMM YYYY HH:mm:ss Z"
        },
        "guid" : {
          "type" : "integer"
        }
      }
    }
  }
}
[/json]
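
A body like the one above can be sent with curl when the index is created, and the stored mapping read back afterwards. This is a sketch under assumptions: Elasticsearch 7.x on localhost:9200, where mapping types such as post no longer exist and text replaces string. The index name blog_posts is made up.

```shell
ES="localhost:9200"

# Typeless mapping body (7.x style). Assumption: "text" stands in for the old "string" type.
MAPPING='{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "date":  { "type": "date" },
      "guid":  { "type": "integer" }
    }
  }
}'

# Create the index with explicit mappings...
curl -s -X PUT "$ES/blog_posts?pretty" \
  -H 'Content-Type: application/json' \
  -d "$MAPPING"

# ...then ask Elasticsearch which mapping it actually stored.
curl -s "$ES/blog_posts/_mapping?pretty"
```

Running the second request against an index created without mappings is a quick way to see what Elasticsearch guessed for each field.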
Analyzers

[json]
{
  "mappings" : {
    "post" : {
      "properties" : {
        "title" : {
          "type" : "string",
          "fields" : {
            "en" : {
              "type" : "string",
              "analyzer" : "<analyzer name>"
            }
          }
        }
      }
    }
  }
}
[/json]

Show Indexes

https://www.elastic.co/guide/en/elasticsearch/reference/6.8/cat-indices.html

curl -X GET 'localhost:9200/_cat/indices?pretty=true'

Create Index

curl -X PUT "localhost:9200/test_index?pretty"

Delete All Indexes

curl -X DELETE localhost:9200/_all
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-delete-index.html

Delete Index

curl -X DELETE localhost:9200/index_name

API Conventions

Most APIs support execution across multiple indices. Different notations are used to perform operations across multiple indexes, e.g. comma-separated lists, wildcard notation, and the _all keyword.
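
The three notations look like this in practice. This is a rough sketch, assuming Elasticsearch on localhost:9200; the blog_2019 and blog_2020 index names are hypothetical.

```shell
ES="localhost:9200"

# Count documents using each notation in turn:
#   comma-separated list, wildcard, and the _all keyword.
for target in 'blog_2019,blog_2020' 'blog_*' '_all'; do
  echo "== $target"
  curl -s "$ES/$target/_count?pretty"
done
```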

Date & Time

Elasticsearch allows you to search indices according to date and time. To do so, specify the index name as a date math expression in the following format:

<static_name{date_math_expr{date_format|time_zone}}>

  • static_name is the static text part of the name
  • date_math_expr computes the date dynamically
  • date_format is an optional date format
  • time_zone is an optional time zone
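
Date math names contain characters that are not URL-safe, so they have to be percent-encoded when used in a request path. A small sketch of that: encode_date_math is a hypothetical helper covering the special characters of the date math syntax, logstash- is just an example static_name, and Elasticsearch is assumed to be on localhost:9200.

```shell
ES="localhost:9200"

# Hypothetical helper: percent-encode the special characters used in date math names.
# "%" is replaced first so the "%" signs added by the later rules are left alone.
encode_date_math() {
  printf '%s\n' "$1" | sed \
    -e 's/%/%25/g' \
    -e 's/</%3C/g' -e 's/>/%3E/g' \
    -e 's/{/%7B/g' -e 's/}/%7D/g' \
    -e 's/\//%2F/g' -e 's/|/%7C/g' \
    -e 's/+/%2B/g' -e 's/:/%3A/g' -e 's/,/%2C/g'
}

# static_name = "logstash-", date_math_expr = "now/d" (now, rounded down to the day)
INDEX=$(encode_date_math '<logstash-{now/d}>')
echo "$INDEX"   # %3Clogstash-%7Bnow%2Fd%7D%3E

# The same name with an explicit date_format and time_zone
encode_date_math '<logstash-{now/d{yyyy.MM.dd|+12:00}}>'

# Search whichever index the expression resolves to today
curl -s "$ES/$INDEX/_search?pretty"
```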

Common Options For all REST APIs

  • Pretty Result
  • Human Readable Output
  • Date Math
  • Response Filtering
  • Flat Settings
  • Parameters
  • Number Values
  • Time Units
  • Byte Size Units
  • Unit-less Quantities
  • Distance Units
  • Fuzziness
  • Enabling Stack Traces
  • Request Body In Query String
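
A few of these options are easiest to understand by trying them. The requests below are a sketch, assuming Elasticsearch on localhost:9200; the endpoints are only there to give the options something to act on.

```shell
ES="localhost:9200"

# Pretty results: indented JSON instead of a single line
curl -s "$ES/_cluster/health?pretty"

# Human readable output: adds sizes and durations in readable units
curl -s "$ES/_cluster/stats?human&pretty"

# Response filtering: return only the listed parts of the response
curl -s "$ES/_search?filter_path=took,hits.hits._id&pretty"

# Flat settings: dotted keys instead of nested objects
curl -s "$ES/_cluster/settings?flat_settings=true&pretty"

# Enabling stack traces: include the stack trace when a request fails
# (size=surprise_me is deliberately invalid)
curl -s "$ES/_search?size=surprise_me&error_trace=true&pretty"
```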

URL Access Control

A proxy with URL-based access control can also be used to secure access to the Elasticsearch indices.

For some requests, the user has the option of specifying an index in the URL and on each individual request within the request body. These include:

  • multi-search
  • multi-get
  • bulk
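
The bulk API shows why this matters for URL-based access control: the URL names no index at all, while each action line in the body picks its own. This is a sketch, assuming Elasticsearch 7.x on localhost:9200; the blog and news indices and the file name bulk.ndjson are made up.

```shell
ES="localhost:9200"

# Each action line names its own index; the line after it is the document.
cat > bulk.ndjson <<'EOF'
{ "index": { "_index": "blog", "_id": "1" } }
{ "title": "First post" }
{ "index": { "_index": "news", "_id": "1" } }
{ "title": "First headline" }
EOF

# --data-binary keeps the newlines intact, which the bulk format relies on.
curl -s -X POST "$ES/_bulk?pretty" \
  -H 'Content-Type: application/x-ndjson' \
  --data-binary @bulk.ndjson
```

A proxy that only inspects the URL would see /_bulk here and nothing about blog or news.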

Elastic Search in Magento

Elasticsearch replaces the old and doddery MySQL search in Magento, bringing faster and better matching algorithms. It’s also nice that customers can check out while another customer uses the store’s search; something hit and miss in the good ol’ MySQL days. The MySQL search indexer (before deprecation) used to live in the Magento\CatalogSearch module. The Elasticsearch module now takes over saving the data, but the CatalogSearch module is still responsible for preparing it. The FullText name is a hangover from [MySQL’s FullText index](https://dev.mysql.com/doc/refman/8.0/en/fulltext-search.html), and is a bit of a misnomer now that Elasticsearch is used.

The indexer command is defined in the CatalogSearch module’s indexer.xml file as catalogsearch_fulltext, so to regenerate the elastic search index we can run

bin/magento indexer:reindex catalogsearch_fulltext

This runs the executeFull() method of the Model\Indexer\Fulltext class.

It’s the executeByDimensions method which actually kicks things off, and either runs a batched index or performs the full index by calling the $saveHandler->saveIndex() method. This is where the Elasticsearch module steps in – the $saveHandler here is the Magento\Elasticsearch\Model\Indexer\IndexerHandler class, and this is the lad who actually saves the index to the Elasticsearch server.

The data for the save handler is generated by the rebuildStoreIndex method of the Magento\CatalogSearch\Model\Indexer\Fulltext\Action\Full class, which uses the Magento\CatalogSearch\Model\Indexer\Fulltext\Action\DataProvider’s getSearchableProducts method.

The product attributes which are to be indexed start off life in the rebuildStoreIndex’s $dynamicFields variable. This stores attributes by their type, in a rather ironic static array of keys (‘int’, ‘varchar’…). I’m actually surprised there’s not some kind of dynamic fields type provider or something here. Magento loves providers.

So the getSearchableAttributes method is the urchin responsible for getting the product attributes which should actually go into the index. It’s the productAttributeCollectionFactory which is used to create the productAttribute collection, which limits attributes to those meeting at least one of the following conditions in the catalog_eav_attribute table: is_searchable = 1, is_visible_in_advanced_search = 1, is_filterable > 0, is_filterable_in_search = 1, used_for_sort_by = 1. These conditions are added to the collection using the addToIndexFilter method, which usefully hardcodes the status and visibility to required codes. A plugin could be added to this method to include additional attributes which don’t have any of the searchable flags set.

As an aside, I think it’s worth giving an honourable mention to the misspelled and deprecated catelogsearch_searchable_attributes_load_after event which gets fired here. I’ll bet Amasty rely on this somewhere in one of their thousand modules. The correctly named event is fired straight after this one :D.

A note on Dimensions

I’m not 100% on what dimensions are, but it appears to be a way to create parallel indexes with different parameters. So a different index could exist based on customer group, or website. That’s my best guess at this point 🙂

To identify which index relates to which website on an Elastic Search server, a prefix should be used. This is set in Magento’s Store Configuration.
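
Once a prefix is set, the indices belonging to one Magento install can be picked out with a wildcard. This is a sketch, assuming Elasticsearch on localhost:9200 and a prefix of magento2, which I believe is Magento’s default; swap in whatever is set in Store Configuration.

```shell
ES="localhost:9200"
PREFIX="magento2"   # assumption: replace with the prefix from Store Configuration

# List only the indices whose names start with the prefix; v adds column headers.
curl -s "$ES/_cat/indices/${PREFIX}_*?v"
```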

Elastic Search & Failing to Restart

How to prevent systemd service start operation from timing out
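
One way to do this is a systemd drop-in that raises the start timeout. The fragment below is a sketch under assumptions: Elasticsearch was installed from a package that registers elasticsearch.service, the drop-in file name is my own choice, and 180 seconds is an arbitrary value to adjust.

```ini
# /etc/systemd/system/elasticsearch.service.d/startup-timeout.conf
[Service]
TimeoutStartSec=180
```

After saving the file, run systemctl daemon-reload and restart the service so systemd picks up the new timeout.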