What’s new in ElasticUtils

Version 0.11: In development -============================

Note

This version supports Elasticsearch 1.0 and later. It does not support versions earlier than 1.0.

This supports elasticsearch-py >= 1.0.

API-breaking changes:

Changes:

Version 0.10.2: October 10th, 2014

Note

This has been tested with Elasticsearch 0.90 up through 1.3.4. We don’t support versions of earlier than 0.90.

This supports elasticsearch-py >= 1.0.

This is a bridging release to help people migrate from Elasticsearch <= 0.90 to Elasticsearch >= 1.0.

The next version of ElasticUtils will not support versions of Elasticsearch < 1.0.

Changes:

  • Fixed monkeypatch to work with all bulk op_types (e.g. insert, create, update and delete)

Version 0.10.1: September 22nd, 2014

Note

This supports Elasticsearch 0.90, 1.0, 1.1 and 1.2. It doesn’t support versions earlier than 0.90 or later than 1.2.

This supports elasticsearch-py >= 1.0.

This is a bridging release to help people migrate from Elasticsearch <= 0.90 to Elasticsearch >= 1.0.

The next version of ElasticUtils will not support versions of Elasticsearch < 1.0.

API-breaking changes:

  • requires elasticsearch-py >= 1.0

Changes:

  • Add distance filter
  • Fix tests to work with Elasticsearch 1.2
  • Invert monkeypatch for bulk indexing
  • Fix infinite recursion when pickling MappingType instances
  • Invert monkeypatch–ElasticUtils now requires elasticsearch-py >= 1.0

Version 0.10: August 19th, 2014

Note

This version supports Elasticsearch 0.90, 1.0 and 1.1. It does not support versions earlier than 0.90 or later than 1.1.

ElasticUtils 0.10 does not work with elasticsearch-py > 0.4.5.

This is a bridging release to help people migrate from Elasticsearch <= 0.90 to Elasticsearch >= 1.0.

The next version of ElasticUtils will not support versions of Elasticsearch < 1.0.

API-breaking changes:

  • big ``.values_list()`` and ``.values_dict()`` changes

    .values_list() and .values_dict() will now always specify the Elasticsearch fields property.

    If you call these two functions with no arguments (i.e. you specify no fields), they will send fields=* to Elasticsearch. It will send any fields marked as stored in the document mapping. If you have no fields marked as stored, then it will return the id and type of the result.

    If you call these two functions with arguments (i.e. you specify fields), then it’ll return those fields—same as before.

    However, they now return all values as lists. For example:

    >>> S().values_list()
    [([100], ['bob'], [40]), ...]
    
    >>> S().values_list('id')
    [([100],), ([101],), ...]
    
    >>> S().values_dict()
    [({'id': [100], 'name': ['bob'], 'weight': [40]}), ...]
    
    >>> S().values_dict('id', 'name')
    [({'id': [100], 'name': ['bob']}), ...]
    
  • Removed text and text_phrase queries. They’re renamed in Elasticsearch to match and match_phrase.

  • Removed startswith query. Replace uses of it with prefix.

Changes:

  • Python 3 support (Python >= 3.3)
  • Supports Elasticsearch 0.90, 1.0 and 1.1

Version 0.9.1: April 7th, 2014

Changes:

  • Fixed bug with facets that are both sized and filtered

Version 0.9: April 3rd, 2014

Note

This is a big change. We switched from pyelasticsearch to elasticsearch-py. The Elasticsearch object you get back from get_es is pretty different. When you upgrade to ElasticUtils v0.9, you’ll probably need to rewrite code.

If this terrifies you, read through these notes carefully and/or stay with ElasticUtils v0.8.

Note

This version supports Elasticsearch 0.20 through 0.90. It does not yet support Elasticsearch 1.0. Support for 1.0 and later will be in a later version of ElasticUtils.

API-breaking changes:

  • elasticsearch-py >= v0.4.3 and < 1.0 required.

    ElasticUtils now uses elasticsearch-py.

    Note

    You have to use elasticsearch-py >= v0.4.3 and < 1.0. ElasticUtils does not support elasticsearch-py 1.0. Support for later versions will come in a future ElasticUtils release.

  • pyelasticsearch no longer needed

    You can remove pyelasticsearch and its requirements.

  • thrift now supported!

    elasticsearch-py supports http, thrift and memcache protocols, so you can use any of them now.

  • pyelasticsearch -> elasticsearch-py changes.

    If you called elasticutils.get_es() and got a pyelasticsearch ElasticSearch object and did things with that (create index, create mappings, delete indexes, indexing, cluster health, ...), you’re going to need to make some changes.

    You can either:

    1. rewrite that code to use elasticsearch-py Elasticsearch equivalents, or
    2. write a different function that returns a pyelasticsearch ElasticSearch object and use that

    Rewriting shouldn’t be too hard. The elasticsearcy-py documentation is pretty good and for most things, there’s a 1-to-1 translation. Also, in many cases it’s cleaner, so you’ll probably be removing code.

  • S.all() no longer returns all results

    If you were using S.all() to return all search results, you should change it to S.everything().

  • `S._build_query()` was changed to `S.build_search()`

    This makes the method public and also changes the name and documentation to be more correct.

    If you really need a S._build_query(), add it to your S subclass.

  • Search results metadata is now in the `es_meta` object

    Previously, you would access search results metadata like this:

    obj._id
    obj._highlight
    obj._score
    
    etc.
    

    In order to make those accessible in Django templates, we moved them into an es_meta object. You can now access them like this:

    obj.es_meta.id
    obj.es_meta.highlight
    obj.es_meta.score
    
    etc.
    

Changes:

  • added S.everything() which does what S.all() did

  • index_objects celery task can now take es and index args

  • unindex_objects celery task can now take es and index args

  • added S.suggestions() support

  • added S.query_and_fetch() support

  • added S.search_type()

  • S.facet() can now take a size keyword argument

  • S.facet_couts() now returns a dict of FacetResults objects

    The FacetResults object contains all the data we get back from that section in the Elasticsearch response.

  • SearchResults now has facet data in facets property

  • elasticutils.estestcase.ESTestCase available and cleaned up

    Previously, it was in elasticutils/tests/__init__.py. This makes it so everyone can use the same TestCase subclass we’re using for our tests.

Version 0.8.2: January 6th, 2014

Changes:

  • Allow pyelasticsearch 0.6.1.

    This alleviates part of the problem in issue #163.

  • Add tox.ini file.

    We’re testing with Python 2.6 and 2.7 on Django 1.4, 1.5 and 1.6.

  • Add caching for empty results.

    ElasticUtils will now correctly remember when it got no results from a search and won’t redo the search.

  • Add support for query and filter facets.

  • Attach facets to search result objects.

  • order_by() accepts a dict as the sort field so you can do advanced sorts.

Version 0.8.1: September 13th, 2013

API-breaking changes:

  • Indexable.index overwrite_existing argument default changed

    In v0.8, we added the overwrite_existing argument, but made it default to False. That’s different than what pyelasticsearch does.

    In v0.8.1, we changed the default to True which is in line with what pyelasticsearch does.

    If you were depending on the old behavior, then you need to update your indexing code to set overwrite_existing=False.

Version 0.8: August 19th, 2013

API-breaking changes:

  • pyelasticsearch v0.6 or later now required.

    Further, since pyelasticsearch has released versions that aren’t backwards compatible, we’re now pegging on specific versions.

  • celery 2.5.5 or later now required.

    You can ignore this if you’re not using the Django celery tasks.

  • Indexable.index arguments changed

    pyelasticsearch changed arguments, so we did, too. We dropped the force_insert argument (which wasn’t working) and picked up overwrite_existing.

    overwrite_existing defaults to False which means it will not overwrite existing documents in the index.

    Note

    This was a mistake since pyelasticsearch defaults to True. We changed this in 0.8.1.

Changes:

  • Added support for ``range`` queries and filters.

    range is a nice shorthand for gte and lte.

  • S.filter_raw added

    If elasticutils.S.filter() isn’t doing as it’s told, then you can skip it and use the Elasticsearch API to create the filter clause of the search by hand with elasticutils.S.filter_raw().

  • Moved requirements files to requirements/.

Version 0.7: Released June 12th, 2013

Note

This is a big change. We switched from pyes to pyelasticsearch. In doing that, we changed a handful of signatures, nixed some functionality that didn’t make any sense any more, and cleaned a bunch of things up.

If this terrifies you, read through these notes carefully and/or stay with v0.6.

API-breaking changes:

  • pyelasticsearch v0.4 or later now required.

    ElasticUtils now requires pyelasticsearch v0.4 or later and its requirements.

  • elasticutils.PYES_VERSION is removed.

    Since we’re not using pyes, we removed elasticutils.PYES_VERSION.

  • ElasticUtils no longer supports thrift.

    Pretty sure we did a lousy job of supporting it before—it was all in the pyes code and we had no tests for it.

  • get_es() signatures have changed.

    • takes urls now instead of hosts
    • dump_curl argument is now gone
    • default_indexes argument is gone

    The arguments correspond with pyelasticsearch ElasticSearch object.

    ElasticUtils uses HTTP urls for connecting to Elasticsearch now. Previously, you’d do:

    get_es(hosts=['localhost:9200'])        # Old way
    

    Now you do:

    get_es(urls=['http://localhost:9200'])  # New way
    

    The dump_curl argument was helpful for debugging, but we don’t really need it anymore. See the Debugging for better debugging methods.

    Will now raise a DeprecationWarning if you pass in hosts argument.

  • S searches all indexes and doctypes by default.

    Previously, if you did:

    S()
    

    it’d search an index named “default” for doctypes “document”. That was dumb. Now it searches all indexes and all doctypes by default.

  • S.es_builder is gone.

    es_builder() was there to get around problems with pyes’ ES class. The pyelasticsearch ElasticSearch class is more straightforward, so we don’t need to do circus shenanigans.

    You can probably do what you need to with either the es() transform or by subclassing S and overriding the get_es() method.

  • MLT arguments changed.

    The fields argument in the constructor was renamed to mlt_fields to be in line with Elasticsearch API names.

    Will now raise a DeprecationWarning if you pass in fields argument.

  • MappingType get_indexes renamed to get_index.

    MappingType had a method called get_indexes. This is now get_index because it should return a single index name.

  • Added Indexable mixin for indexing bits for MappingTypes.

  • Django: changed settings.

    Changed ES_HOSTS setting to ES_URLS. This is both a name and a value change. ES_URLS takes a list of strings each is an http url. You’ll neex to update your settings files from:

    ES_HOSTS = ['localhost:9200']        # Old way
    

    to:

    ES_URLS = ['http://localhost:9200']  # New way
    

    ES_DUMP_CURL is gone.

  • Django: removed the statsd code.

  • Django: ESTestCase was improved, documented and bugs squashed.

    It was improved, documented and bugs were squashed. It’s now used by the test suite.

  • Django: Indexable.index() method no longer has bulk argument.

    The Indexable.index() method no longer does bulk indexing. The way pyes did this was kind of squirrely and caused issues if you didn’t have the order of operations correct.

    Now Indexable.index() only indexes a single document.

    But wait...

  • Django: Indexable now has bulk_index().

    pyes would keep track of all the things you wanted to bulk index and then at some point push them all. Instead of doing it under the hood, we added a separate bulk_index() method and now you control how many items get indexed in bulk in one pass.

  • Django: Indexable.refresh_index no longer takes a timeout argument.

    pyelasticsearch ElasticSearch.refresh doesn’t take a timesleep argument, so we don’t need that anymore.

  • Django: Indexable es argument defaults to Indexable.get_es() now.

    Previously it defaulted to elasticsearch.contrib.django.get_es(). Now it defaults to Indexable.get_es() class method making it more flexible.

  • Django: renamed DjangoMappingType to MappingType.

  • Django: moved MappingType and Indexable.

    They were in elasticutils.contrib.django.models and are now in elasticutils.contrib.django. Yay for slightly shorter module paths!

  • Django: ditched the cron module and its helpers.

    It’s not clear they ever worked (issue #21) and there are no tests.

  • pyes -> pyelasticsearch changes.

    If you called .get_es() and got a pyes ES object and did things with that (create index, create mappings, delete indexes, indexing, cluster health, ...), you’re going to need to make some changes.

    You can either:

    1. rewrite that code to use pyelasticsearch ElasticSearch equivalents, or
    2. write and use your own get_es() function that returns a pyes ES object

    Rewriting shouldn’t be too hard. The pyelasticsearch documentation is pretty good and for most things, there’s a 1-to-1 translation.

Changes:

  • pyes is no longer a requirement.

    We no longer use pyes so you can remove it from your requirements.

  • S.execute added

    This allows you to explicitly execute a search and get back a SearchResults instance.

    See elasticutils.S.execute() for details.

  • S.all added

    Allows you to get all the search results possible rather than just the first 10 search results which is the default.

    You should consider using slices instead which allows you to specify the maximum number of results to get back.

    This is dangerous, so it’s been documented with lots of warnings.

    See elasticutils.S.all() for details.

  • Added support for ``match`` and ``match_phrase`` queries.

    Elasticsearch 0.19.9 renamed text query to match query. This adds support for match and match_phrase.

    See Queries: query for details.

  • Added support for ``wildcard`` and ``terms`` queries.

    See Queries: query for details.

  • Reimplemented filter and query implementation.

    The new implementations allow you to add handling for filters and queries that ElasticUtils doesn’t handle as well as override what ElasticUtils does.

    See elasticutils.S for details.

  • S.query_raw added

    If elasticutils.S.query() is getting you down, then you can skip it and use the Elasticsearch API to create the query clause of the search by hand with elasticutils.S.query_raw().

  • Django: es_required_or_50x handles different exceptions.

    Previously it handled:

    • pyes.urllib3.MaxRetryError
    • pyes.exceptions.IndexMissingException
    • pyes.exceptions.ElasticSearchException

    We’re not using pyes anymore, so now it handles:

    • pyelasticsearch.exceptions.ConnectionError
    • pyelasticsearch.exceptions.ElasticHttpError
    • pyelasticsearch.exceptions.ElasticHttpNotFoundError
    • pyelasticsearch.exceptions.InvalidJsonResponseError
    • pyelasticsearch.exceptions.Timeout

    You probably don’t need to do anything about this, but it’s good to know.

  • Django: celery tasks rewritten.

    The celery tasks were rewritten, docs were updated, and tests were added so they work now.

Version 0.6: Released January 17th, 2013

API-breaking changes:

  • S.values_dict no longer always includes id.

    values_dict no longer always includes an ‘id’ field in the fields list if you don’t specify it.

    Specifying no fields now returns all fields:

    S().values_dict()
    

    Specifying fields now returns only those fields:

    S().values_dict('name', 'number')
    
  • S.values_list no longer always includes id.

    values_list no longer always includes an ‘id’ field in the fields list if you don’t specify it.

    Specifying no fields now returns all fields:

    S().values_list()
    

    Specifying fields now returns data for those fields in the order the fields are specified:

    S().values_list('name', 'number')
    
  • Types have changed.

    This is a big change.

    Up through ElasticUtils v0.5, S could take a type and that type was a model. This is now completely different.

    In ElasticUtils v0.6 and later, S takes a MappingType. A MappingType can be related to a model, but it itself should not be a model. This allows us to return search results as a list of MappingType instances which can do things rather than forcing you to do a db hit to get back instances that can do things.

    This is similar to how django-haystack works with the SearchIndex class, except ElasticUtils doesn’t yet support declarative mapping definition.

    See documentation for more details.

  • By default, results are now DefaultMappingType.

    In ElasticUtils v0.4 and v0.5, if the S was untyped and you didn’t specify either values_dict or values_list, then the results would come back as a list of dicts.

    In ElasticUtils v0.5, if the S is untyped and you didn’t specify either values_dict or values_list, then the results would come back as a list of DefaultMappingType.

    See documentation for more details.

  • elasticutils.contrib.django.models.SearchMixin is no more.

    The SearchMixin class is replaced by DjangoMappingType which relates Elasticsearch mapping types to Django ORM models and Indexable which is a mixin that adds a bunch of index-related infrastructure.

Changes:

  • Added _source and _id to the metadata decorated on the search results.

    See documentation for more details.

  • Fixed elasticutils.contrib.django.es_required_or_50x.

    It works better now.

  • prefix filter support.

    ElasticUtils supports prefix filters. You can do this now:

    S().filter(name__prefix='odin')
    

Version 0.5: Released September 4th, 2012

API-breaking changes:

None.

Changes:

  • Added demote transform: it adds boosting query support allowing you to do a negative query which reduces scores for documents that match.

  • The elasticutils version is now available in elasticutils.__version__ as well as elasticutils._version.__version__.

  • Added __in support for queries. Doing:

    S().query(foo__in=['a', 'b', 'c'])
    

    does a terms query now.

  • Added MLT class which does morelikethis.

  • Added API documentation for S, an index, order_by docs, fixed some icky bugs, and generally improved everything at least a little bit.

Version 0.4: Released July 31st, 2012

API-breaking changes:

  • ElasticUtils no longer requires Django.

    If you’re using Django, you should change your import statements from things like:

    from elasticutils import get_es, S, F
    

    to:

    from elasticutils.contrib.django import get_es, S, F
    

    Further, Django helper modules like cron, tasks, and models were all moved to elasticutils.contrib.django.

    We moved ESTestCase from elasticutils.tests to elasticutils.contrib.django.estestcase

    If you don’t use Django, ElasticUtils is easier to use!

  • S no longer requires a type.

    If you’re not using Django, S no longer requires a type. If you don’t specify a type, then ElasticUtils will return results as dicts.

  • Values and values_list changed.

    values() was renamed to values_list().

    values_list() (was values()) now always returns a list of tuples even if you only requested a single field. Previously, doing something like:

    searcher = S().values_list('id')
    

    would return something like:

    [1, 2, 3, 4, 5]
    

    Now it returns:

    [(1,), (2,), (3,), (4,), (5,)]
    
  • Facet functionality was rewritten.

    Changed .facet() to be arg-driven and allow for filtered and global_ flags.

    Changed .facets() to .facet_counts() to match Django Haystack.

    Added .facet_raw() which allows you to do more complicated facets including scripting. This is similar to the original .facet() implementation.

Changes:

  • Overhauled and cleaned up ElasticUtils tests. Running tests can be done with:

    DJANGO_SETTINGS_MODULE=es_settings nosetests
    
  • Default timeout was changed from 1 second to 5 seconds.

  • Added es transform: it allows you to specify the settings with which to create an ES when the search is executed.

  • Added es_builder transform: it allows you to specify a function that builds an ES which will be executed to create an ES when the search is executed.

  • Added indexes transform: it allows you to specify the indexes to use for the search.

  • Added doctypes transform: it allows you to specify the doctypes to use for the search.

  • Added explain transform: it allows you to set the “explain” flag which gives you an explanation of how the score was calculated.

    I also added elasticutils.utils.format_elasticutils which formats the resulting explanation text into something slightly more readable. But it’s likely this will change in the future.

  • Added boost transform: it allows you to do query-time field boosting.

  • Added support for prefix. It’s the same as startswith, but it uses the same word that ElasticSearch uses. At some point, we’ll remove support for startswith.

  • Added support for text_phrase and query_string queries.

  • Added highlight transform: generates highlighted fragments of content that matched the query.

  • Removed requirement for nuggets.

  • Continued to improve documentation.

Version 0.3: Released June 1st, 2012

Changes:

  • Add documentation for debugging, project details and other things.
  • Minor project cleanup to make it easier to maintain and use
  • Make get_es() more useful. It now takes overrides that allow you to configure multiple kinds of ES objects for different purposes.