Class CustomPostingsHighlighter



  • public final class CustomPostingsHighlighter
    extends XPostingsHighlighter
    A subclass of XPostingsHighlighter that works on a single field of a single document. It receives the field values as input and highlights each value discretely, calling the highlightDoc method multiple times. The query terms can be passed in, which avoids extracting them more than once. This use of the postings highlighter is not optimal: it would be much better to highlight multiple documents in a single call, because highlighting one document at a time loses the highlighter's sequential IO. That, however, would require: 1) making our fork more complex and harder to maintain in order to support discrete highlighting (needed to return a different snippet per value when number_of_fragments=0 and the field has multiple values); 2) refactoring the elasticsearch highlight API, which currently works per hit.
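    The idea of discrete highlighting can be illustrated with a minimal, self-contained sketch. It does not use the real highlighter: DiscreteHighlightSketch, highlightValue and highlightDiscretely are hypothetical names, and the toy term-wrapping logic is an assumption made only to show that each value is processed on its own and yields its own snippet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DiscreteHighlightSketch {

    /** Toy highlighter (assumption): wraps each occurrence of the term in em tags. */
    static String highlightValue(String value, String term) {
        return value.replace(term, "<em>" + term + "</em>");
    }

    /**
     * Highlights every field value separately, one call per value,
     * and returns one snippet for each value that contains the term.
     */
    static List<String> highlightDiscretely(List<String> fieldValues, String term) {
        List<String> snippets = new ArrayList<>();
        for (String value : fieldValues) {
            String highlighted = highlightValue(value, term);
            // Keep a snippet only when this particular value matched.
            if (!highlighted.equals(value)) {
                snippets.add(highlighted);
            }
        }
        return snippets;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("quick brown fox", "lazy dog", "another fox here");
        for (String snippet : highlightDiscretely(values, "fox")) {
            System.out.println(snippet);
        }
    }
}
```

    Because the values are never concatenated before highlighting, a snippet cannot span two values, which is the property the class description relies on for multi-valued fields.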
    • Constructor Detail

      • CustomPostingsHighlighter

        public CustomPostingsHighlighter(CustomPassageFormatter passageFormatter,
                                         List<Object> fieldValues,
                                         boolean mergeValues,
                                         int maxLength,
                                         int noMatchSize)
    • Method Detail

      • highlightDoc

        public Snippet[] highlightDoc(String field,
                                      org.apache.lucene.util.BytesRef[] terms,
                                      org.apache.lucene.search.IndexSearcher searcher,
                                      int docId,
                                      int maxPassages)
                               throws IOException
        Throws:
        IOException
      • getContentLength

        protected int getContentLength(String field,
                                       int docId)
      • getOffsetForCurrentValue

        protected int getOffsetForCurrentValue(String field,
                                               int docId)
      • setBreakIterator

        public void setBreakIterator(BreakIterator breakIterator)
      • getFormatter

        protected org.apache.lucene.search.postingshighlight.PassageFormatter getFormatter(String field)
        Description copied from class: XPostingsHighlighter
        Returns the PassageFormatter to use for formatting passages into highlighted snippets. This returns a new PassageFormatter by default; subclasses can override to customize.
      • getMultiValuedSeparator

        protected char getMultiValuedSeparator(String field)
        Description copied from class: XPostingsHighlighter
        Returns the logical separator between values for multi-valued fields. The default value is a space character, which means passages can span across values, but a subclass can override, for example with U+2029 PARAGRAPH SEPARATOR (PS) if each value holds a discrete passage for highlighting.
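        As a rough illustration of what the separator implies, the sketch below joins the values of a multi-valued field with a chosen separator character and computes where each value starts in the joined content. SeparatorSketch, join and valueOffsets are hypothetical helpers, not part of the highlighter, and the "length plus one" offset rule is an assumption that holds only for a single-character separator.

```java
public class SeparatorSketch {

    /** U+2029 PARAGRAPH SEPARATOR, the alternative separator named in the description. */
    static final char PARAGRAPH_SEPARATOR = '\u2029';

    /** Joins the values with a single separator character between them. */
    static String join(String[] values, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    /** Start offset of each value inside the joined content (one char per separator). */
    static int[] valueOffsets(String[] values) {
        int[] offsets = new int[values.length];
        int offset = 0;
        for (int i = 0; i < values.length; i++) {
            offsets[i] = offset;
            offset += values[i].length() + 1; // +1 accounts for the separator
        }
        return offsets;
    }

    public static void main(String[] args) {
        String[] values = {"ab", "cde", "f"};
        String joined = join(values, PARAGRAPH_SEPARATOR);
        int[] offsets = valueOffsets(values);
        for (int i = 0; i < values.length; i++) {
            // Each value can be recovered from the joined content via its offset.
            String recovered = joined.substring(offsets[i], offsets[i] + values[i].length());
            System.out.println(offsets[i] + " -> " + recovered);
        }
    }
}
```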
      • getEmptyHighlight

        protected org.apache.lucene.search.postingshighlight.Passage[] getEmptyHighlight(String fieldName,
                                                                                         BreakIterator bi,
                                                                                         int maxPassages)
        Description copied from class: XPostingsHighlighter
        Called to summarize a document when no hits were found. By default this just returns the first maxPassages sentences; subclasses can override to customize.
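        The default behaviour described above (returning the first maxPassages sentences) can be approximated with the JDK's java.text.BreakIterator. This is only a sketch under stated assumptions: EmptyHighlightSketch and firstSentences are hypothetical names, and plain strings stand in for the Passage objects that the real method returns.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class EmptyHighlightSketch {

    /** Returns at most maxPassages leading sentences of the content. */
    static List<String> firstSentences(String content, int maxPassages) {
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
        bi.setText(content);
        List<String> passages = new ArrayList<>();
        int start = bi.first();
        int end = bi.next();
        // Walk sentence boundaries until enough passages are collected
        // or the iterator runs out of text.
        while (end != BreakIterator.DONE && passages.size() < maxPassages) {
            passages.add(content.substring(start, end));
            start = end;
            end = bi.next();
        }
        return passages;
    }

    public static void main(String[] args) {
        String content = "One fish. Two fish. Red fish.";
        for (String sentence : firstSentences(content, 2)) {
            System.out.println("[" + sentence + "]");
        }
    }
}
```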
      • loadFieldValues

        protected String[][] loadFieldValues(org.apache.lucene.search.IndexSearcher searcher,
                                             String[] fields,
                                             int[] docids,
                                             int maxLength)
                                      throws IOException
        Description copied from class: XPostingsHighlighter
        Loads the String values for each field X docID to be highlighted. By default this loads from stored fields, but a subclass can change the source. This method should allocate the String[fields.length][docids.length] and fill all values. The returned Strings must be identical to what was indexed.
        Overrides:
        loadFieldValues in class XPostingsHighlighter
        Throws:
        IOException
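        The array contract stated in the inherited description (a String[fields.length][docids.length] with every cell filled) can be sketched against an alternative value source. Everything here is an assumption for illustration: LoadFieldValuesSketch is a hypothetical class, an in-memory map replaces the IndexSearcher and stored fields, missing values are filled with an empty string, and maxLength is interpreted as a cap on the number of characters kept per value.

```java
import java.util.HashMap;
import java.util.Map;

public class LoadFieldValuesSketch {

    /**
     * Fills a [fields.length][docids.length] array from an in-memory store
     * that maps docId -> (field name -> value).
     */
    static String[][] loadFieldValues(Map<Integer, Map<String, String>> store,
                                      String[] fields,
                                      int[] docids,
                                      int maxLength) {
        String[][] contents = new String[fields.length][docids.length];
        for (int f = 0; f < fields.length; f++) {
            for (int d = 0; d < docids.length; d++) {
                Map<String, String> doc = store.get(docids[d]);
                String value = (doc == null) ? null : doc.get(fields[f]);
                if (value == null) {
                    value = ""; // assumption: no cell is left unfilled
                }
                if (value.length() > maxLength) {
                    value = value.substring(0, maxLength); // assumption: cap per value
                }
                contents[f][d] = value;
            }
        }
        return contents;
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, String>> store = new HashMap<>();
        Map<String, String> doc7 = new HashMap<>();
        doc7.put("title", "Postings highlighting");
        store.put(7, doc7);

        String[][] contents = loadFieldValues(
                store, new String[] {"title", "body"}, new int[] {7, 9}, 8);
        System.out.println(contents.length + " x " + contents[0].length);
        System.out.println("[" + contents[0][0] + "]");
    }
}
```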
      • loadCurrentFieldValue

        protected String loadCurrentFieldValue()