In the world of NoSQL

Searches with leading wildcards stopped working for me after I moved to a multicore Solr configuration and, at the same time, removed all the dead entries from my schema.xml.

 

I was getting the following exception whenever I searched with a leading wildcard:

Nov 4, 2011 12:14:44 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException:
 org.apache.lucene.queryParser.ParseException:
Cannot parse 'field1:*': '*' or '?' not allowed as first character in WildcardQuery
at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)

 

After diffing the configs and schemas, I still couldn’t find anything that explained why it had stopped working. Finally I found a forum post which said that, to perform leading wildcard searches, the schema has to define a field type that uses the “ReversedWildcardFilterFactory”. So I pasted the text_rev field type from the default schema.xml:

<fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true" maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

and restarted Solr, and leading wildcard searches started working again.

This is despite the fact that I’m not using the text_rev field type on any field.
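
One likely explanation, going by the Solr 3.x source: when the query parser is created, it scans every field type defined in the schema (not just the ones in use) and allows leading wildcards as soon as any of them includes ReversedWildcardFilterFactory. That lifts the parse error, but a field only benefits from the reversed tokens if it is actually indexed with such a type. Below is a minimal sketch of wiring that up, assuming the text_rev type above is present in schema.xml. The field name field1_rev is hypothetical, and field1 is taken from the error message.

```xml
<!-- Sketch only: field1_rev is a hypothetical name, not part of the original schema -->
<fields>
  <!-- Indexed with text_rev, so each token is stored both as-is and reversed -->
  <field name="field1_rev" type="text_rev" indexed="true" stored="false"/>
</fields>

<!-- Copy the raw value of field1 into field1_rev at index time -->
<copyField source="field1" dest="field1_rev"/>
```

With this in place, a query such as field1_rev:*foo should be matched against the reversed tokens instead of scanning every term in the field. Existing documents would need to be reindexed after the schema change.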


§55 · November 4, 2011 · Apache Solr / Lucene · How to activate Solr / Lucene leading wildcards