Getting Started
***************

.. highlight:: bash

- Click here for :doc:`commands`
- Click here for :doc:`docker`
- Click here for :doc:`kibana`

.. tip:: To install ElasticSearch, just follow the instructions on their web
         site, `The Debian package for Elasticsearch`_.

ElasticSearch 6
===============

::

  wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
  echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list

Install
-------

::

  sudo apt update
  sudo apt install elasticsearch
  sudo service elasticsearch start

Tips
====

.. tip:: See https://www.kbsoftware.co.uk/docs/app-search.html for diagnostics
         etc.

.. tip:: See https://www.kbsoftware.co.uk/docs/dev-elasticsearch.html for how
         to install the Phonetic Analysis plugins.

Query
=====

Match all documents:

.. code-block:: json

  {
    "match_all": {}
  }

Sample
======

Using HTTPie::

  http GET http://localhost:9200/job-index/_analyze analyzer=my_analyzer text="Plymouth"

Stats
=====

Using the python API:

.. code-block:: python

  import pprint

  from elasticsearch import Elasticsearch

  es = Elasticsearch()
  stats = es.indices.stats(index='my-index')
  pp = pprint.PrettyPrinter(indent=4)
  pp.pprint(stats)
  stats['_all']['primaries']['docs']
  # {'count': 6, 'deleted': 0}

DjangoConEU 2015
================

Based on Lucene.

No need to just index what is in the model.  You can cram as much stuff as you
want into a document.

Does not have to be in a simple key/value format.  It will happily accept lists
etc.  Just has to be in the format of a simple JSON document.

We must have an ``_id`` field e.g::

  def to_search(self):
      d = {
          '_id': self.pk,
          'creation_date': self.creation_date,
          'body': self.body,
          'score': self.score,
          'comments': [c.to_search() for c in self.comments()],
      }
      # using the DocType from below
      return QuestionDoc(meta={'id': d.pop('_id')}, **d)

Very easy to query many indexes at once.

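
The ``to_search`` idea above can be tried without Django or ElasticSearch
installed.  The sketch below is only an illustration: the ``Question`` and
``Comment`` dataclasses and the ``split_id`` helper are hypothetical stand-ins
(they are not part of the talk or of any library).  It shows a nested document
being built and the ``_id`` being separated from the fields to index.

```python
import json
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Comment:
    """Hypothetical stand-in for a Django model."""
    pk: int
    body: str

    def to_search(self) -> dict:
        return {'_id': self.pk, 'body': self.body}


@dataclass
class Question:
    """Hypothetical stand-in for a Django model with related comments."""
    pk: int
    creation_date: date
    body: str
    score: int
    comments: List[Comment] = field(default_factory=list)

    def to_search(self) -> dict:
        # Nested lists are fine: the result only has to be JSON-serialisable.
        return {
            '_id': self.pk,
            'creation_date': self.creation_date.isoformat(),
            'body': self.body,
            'score': self.score,
            'comments': [c.to_search() for c in self.comments],
        }


def split_id(doc: dict):
    """Separate the ``_id`` (metadata) from the fields that get indexed."""
    source = dict(doc)  # copy, so the caller's dict is left untouched
    doc_id = source.pop('_id')
    return doc_id, source


question = Question(1, date(2015, 6, 1), 'Which bean?', 3, [Comment(7, 'Arabica')])
doc_id, source = split_id(question.to_search())
print(doc_id)
print(json.dumps(source, indent=2))
```

The date is converted with ``isoformat()`` here so that ``json.dumps`` accepts
the document; a real client may handle that conversion for you.
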
After loading, to verify that the information has loaded into ElasticSearch::

  http://localhost:9200/
  http://localhost:9200/_search
  http://localhost:9200/_search?q=bean
  http://localhost:9200/_search?q=tags:bean
  http://localhost:9200/_search?q=awful flavor

- http://localhost:9200/ will return the version number.
- Scoring is not relevant when you only search for one word.
- It used to ignore the common words e.g. ``the``, but no longer.

Client::

  # this is a very low level api
  from elasticsearch import Elasticsearch
  es = Elasticsearch()
  es.info()
  es.search(q='awful flavour')
  # note: the "filtered" query was removed in Elasticsearch 5
  # (use "bool" with a "filter" clause on newer versions)
  es.search(body={"query": {"filtered": {
      "query": {"bool": {"should": [
          {"match": {"title": "bean"}},
          {"match": {"body": "bean"}},
      ]}},
      "filter": {"term": {"tags": "beans"}},
  }}})
  es.indices.get_mapping(index='stack', doc_type='question')

  # this is better
  from elasticsearch_dsl import Q, Search
  s = Search()
  # one query type
  s = s.query('match', body='bean')
  s.to_dict()
  # another query type (each call returns a new Search, so keep the result)
  s = s.filter('term', tags='beans')
  s = s.query(
      'bool',
      should=[
          Q('match', title='beans'),
          Q('match', title__ngram='beans'),
          Q('match', title={'query': 'beans', 'fuzziness': 2}),
      ],
      minimum_should_match='30%'
  )

  # result can use dot notation e.g.
  result.comment
  # for the id, we use meta
  result.meta.id
  result.aggregations.per_tag.buckets

  # DocType is just like a Django model
  # in search.py
  # ElasticSearch still uses the dynamic mappings
  from elasticsearch_dsl import Date, DocType, String

  class Question(DocType):
      creation_date = Date()
      tags = String(index='not_analyzed', multi=True)

  Question._doc_type.mapping.to_dict()
  # refresh the actual field types from elasticsearch
  Question._doc_type.refresh()
  Question._doc_type.mapping.to_dict()
  Question.get(id=464)

Rely on ``post_save`` being more or less reliable and then reindex everything
every now and again::

  from django.db.models.signals import post_save

  def update_search(instance, **kwargs):
      instance.to_search().save()

  post_save.connect(update_search, sender=Answer)

You should have 1 server or more than 2.  Do not have 2 servers.
With exactly two servers, a network partition can leave each one believing it
is the master.  This is called *split brain*.

.. _`The Debian package for Elasticsearch`: https://www.elastic.co/guide/en/elasticsearch/reference/current/deb.html
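
The advice against running exactly two servers can be checked with a little
arithmetic.  The sketch below assumes that a master is only elected when a
strict majority of the master-eligible nodes agree (in ElasticSearch 6 and
earlier this is what the ``discovery.zen.minimum_master_nodes`` setting is
meant to enforce).  The function names are made up for this illustration.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """Nodes that can be lost while a majority can still be formed."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


for n in (1, 2, 3, 5):
    print(f"nodes={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

Under this assumption two servers tolerate no more failures than one: either
the quorum is 2 and losing one node stops master election, or the quorum is
set to 1 and both halves of a partition can elect their own master.  Three
servers are the smallest cluster that survives the loss of a node.
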