Advanced Search
WebDSL offers full text search engine capabilities based on Apache Lucene and Hibernate Search. Current implementation supports:
- Set up which entity properties are searchable
- Full text search on (a subset of) searchable entity properties
- Range queries (numeric and date properties)
- Boolean queries
- Faceted search (both discrete values and ranges)
- Index and query time boosting
- Customized preprocessing of searchable properties/queries using Solr analyzer building blocks (tokenizers, character and token filters):
  - Common analyzers are predefined
  - Use custom stop words that are ignored at indexing/querying
- Get more relevant results by:
  - Synonym analyzer (ipad, i-pad and i pad all match the same)
  - Stem words to their root word (performing -> perform)
  - Phonetic search (words that sound similar are matched)
  - Many more
- Filter search results by property value or faceted search
- Sort search results
- Pagination of search results
- Result highlighting
- Spell checking
- Auto completion
- Create search name spaces based on property value
Marking properties as searchable
Using search in WebDSL starts by marking which entities need to be searchable. If one property is marked searchable, the entity can be searched. For each entity property one or more search fields can be specified. There are 2 ways to specify these: using search mappings or using searchable annotations. For simple search functionality, searchable annotations will suffice, but for cleaner code we recommend using search mappings.
Using search mappings (recommended)
A search mapping starts with the name of the property to be indexed, optionally followed by mapping specifications:
as name
Indexed under name instead of the property name.
using analyzer
Indexed using analyzer analyzer instead of the default analyzer.
boosted to Float|*Float
Search field is boosted to Float at index time (default 1.0).
(spellcheck)|(autocomplete)|(spellcheck,autocomplete)
Indicate that this search field can be used for spell checking/autocompletion.
for subclass entity
When marking a reference/composite property as searchable, you may want to make only a specific subclass of the property type searchable.
depth Int|with depth Int
When marking a reference/composite property as searchable, you can specify the depth of the 'embedded' path; 1 is the default.
+ mapping specification
Prefix a mapping specification with the plus sign if you want this search field to be used by default at query time.
If no default search field is specified, all search fields are used by default.
Search mappings belong to an entity and can be placed inside an entity declaration, or somewhere else by adding the entity name. Names of the search fields are scoped to entities, so different entities may share the same names for search fields.
//Embedded search mapping
entity Message {
  subject  :: String
  text     :: Text
  category :: String
  sender   -> User
  searchmapping {
    +subject
    +text using snowballporter as textSnowBall
    text
    category
    +sender for subclass ForumUser
  }
}
//External search mapping
entity ForumUser : User {
  forumName :: String
  forumPwd  :: Secret
  messages  -> Set<Message> (inverse=Message.sender)
}
...
searchmapping ForumUser {
  forumName using no
}
Using annotations
Search fields can also be specified using property annotations:
//Using searchable annotations
entity Message {
  subject  :: String (searchable)
  text     :: Text (searchable, searchable(name=textSnowBall, analyzer=snowballporter))
  category :: String (searchable)
  sender   -> ForumUser (searchable())
}
The above code marks the entity Message as searchable. Among its search fields are subject, text (using the default analyzer), and textSnowBall, which uses the snowball porter analyzer. Searchable annotations are as expressive as search mappings, and the two can be mixed, although mixing is not recommended because it is less transparent. The following table shows the annotation equivalent of specifications in search mappings.
| search mapping | <-> | searchable annotation |
| subject | <-> | searchable or searchable() |
| subject as sbj | <-> | searchable(name = sbj) |
| subject using defaultNoStop | <-> | searchable(analyzer = defaultNoStop) |
| subject * 2.0 | <-> | searchable()*2.0 |
| subject boosted to 2.0 | <-> | searchable(boost = 2.0) |
| subject as sbjTriGram using trigram boosted to 0.5 | <-> | searchable(analyzer = trigram, name = sbjTriGram)*0.5 |
| subject as sbjUntokenized using none | <-> | searchable(analyzer = none, name = sbjUntokenized) |
| message as sbjAC using kwAnalyzer (autocomplete) | <-> | searchable(analyzer = kwAnalyzer, name = sbjAC, autocomplete) |
| user as forumuser for subclass ForumUser | <-> | searchable(name = forumuser, subclass = ForumUser) |
| user with depth 2 | <-> | searchable(depth=2) |
| + text as txt | <-> | searchable(name = txt, default) |
Which properties can be made searchable?
Properties of any type can be made searchable, with the following caveats.
Reference and composite properties
These properties don't contain any text or value by themselves, but hold references to other entities. Therefore, the properties themselves cannot be indexed; instead, the searchable properties of the referred entity/entities are indexed in the scope of the current entity. For example, to search for Message entities by the name of the sender (in the above example), the property forumName of ForumUser needs to be indexed in the scope of Message. This is done by marking the sender property as searchable. All search fields from ForumUser then become available for Message, prefixed with 'propertyName.' by default (or with a different name if one is specified using as in the search mapping). The search field from the example becomes sender.forumName.
Note: Searchable reference/composite properties need to be part of an inverse relation to keep the index of the owning entity updated with changes in its reference entity/entities. The mapping options available for reference properties are restricted to name and subclass.
Numeric properties (Float,Int,Date,DateTime,Time)
If no analyzer is specified for a numeric property search field, it is indexed as a numeric field, which is a special type of field in Lucene. This enables efficient range queries and sorting on the field.
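As a small sketch of how a numeric search field might be declared and queried: the entity Review and its properties below are hypothetical and only serve as illustration. Because rating is mapped without an analyzer, it should be indexed as a numeric field. The range query uses the query language described later in this section.

```webdsl
// Hypothetical entity, used only for illustration
entity Review {
  title  :: String
  rating :: Int
  searchmapping {
    +title
    rating //no analyzer specified, so indexed as a numeric field
  }
}

// Returns the reviews rated 4 or 5 (both limits are included in the range)
function topReviews() : List<Review> {
  var searcher := search Review matching rating: 4 to 5;
  return searcher.list();
}
```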
Derived properties
Derived properties are currently only indexed when the entity owning this property is saved/changed.
Searching the data!
For each indexed entity, search functions and a searcher class are automatically generated. For simple searches, the generated functions will suffice. For more advanced searches, the magic is in the generated entity searcher(s).
Using generated search functions
For the example entity Message, the following search functions are generated.
function searchMessage(query : String) : List<Message>
function searchMessage(query : String, limit : Int) : List<Message>
function searchMessage(query : String, limit : Int, offset : Int) : List<Message>
The limit and offset parameters can be used for paginated results. At most limit results are loaded from the database, which keeps page loads fast. These functions search the default search fields, and the analyzer specified for each search field is applied.
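A paginated result page built on the generated function could look roughly like the sketch below. The page name messageResults, the zero-based pageNr argument and the page size of 10 are assumptions made for this example, as is the reading of offset as a number of results to skip. showResults is the placeholder template defined in the searcher example further on.

```webdsl
// Sketch: pageNr is assumed to be zero-based, pageSize is an arbitrary choice
define page messageResults(query : String, pageNr : Int) {
  var pageSize := 10;
  // skip the results of the previous pages, then load one page of results
  var results := searchMessage(query, pageSize, pageNr * pageSize);
  showResults(results)
}
```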
Using WebDSL query language for full text search
More features are available when using WebDSL's query language, which is designed to perform search operations. The language lets you interact with the generated Searcher object for the targeted entity. A reference to (or initialization of) a searcher is followed by one or more constructs in which search criteria can be declared.
//matches Messages with "tablet", but without "ipad"
var msgSearcher := search Message matching +"tablet", -"ipad";
//enable faceting on an existing searcher
msgSearcher := ~msgSearcher with facets (sender.forumName, 20), (category, 10);
List of search language constructs:
Retrieving search results
var searcher := search Book matching author: "dahl";
var results := searcher.list(); //returns List<Book>;
Calling .list() on a searcher returns the search results. Calling .size() on a searcher returns the total number of results.
Simple and boolean queries: 'matching { [{field ,}:] {qExp ,} ,}'
searcher := search Entity matching title: "user interface";
searcher := search Entity matching title, description: userQuery;
searcher := search Entity matching "user interface";
searcher := search Entity matching title: +userQuery, -"case study";
searcher := search Entity matching ranking:4 to 5, title:-"language";
Declares a searcher that matches a simple or boolean query. Fields are optional: if the query expression is not preceded by a field constraint, the default search fields are used (i.e. all search fields if no default fields are defined, see ...). qExp can be any String-compatible WebDSL expression or a range expression, optionally prefixed with a boolean operator (+ for must, - for mustnot, nothing for should). Range expressions take the form exp to exp, where exp can be any expression of a simple WebDSL built-in type.
Range queries
searcher := search Entity matching rating: 1 to 3
searcher := search Entity matching rating: startDate to endDate
searcher := search Entity matching rating: -( to sinceDate)
Very similar to an ordinary query. The upper and lower limit of the range are included in the search. Use a white space to leave upper or lower limit open. When using boolean operators, the range needs to be surrounded with parentheses.
Pagination
var searcher := search Book matching author: "dahl" start 1 limit 10
With the start and limit keywords, you can control which results are retrieved.
Configuration options: '[ {option* ,} ]'
searcher := search Entity matching title: q [nolucene, strict matching];
Declare the searcher's options. Available options are:
- lucene: allow lucene query syntax
- nolucene: disallow lucene query syntax
- strict matching: all terms must match by default
- loose matching: at least one term should match by default
Filtering: 'where {filterconstraint* ,}'
searcher := search Entity matching title: "graph" where hidden=false;
Specify a filter constraint. A filter constraint is a field-value expression. Be aware that when using a field-value filter expression, a bitset is constructed and cached to accelerate future queries using the same filter. Thus: only use field-value filters if you expect the same filtering to occur frequently in your application.
Enabling facets: 'with facet(s) (field1, e1), (field2, e2)'
Example:
searcher := search Entity matching title: "graph" with facet (author, 10);
searcher := search Entity matching title: "graph" with facets (author, 10), (rating, "( TO 1)(2 TO 3)(3 TO 4)(4 TO )");
Specify enabled facets. These can be discrete or range facets.
Retrieving facets: 'get [all] facets(searcherExp, field)'
facets := get facets(s, author);
facets := get all facets(s, author);
Returns a List<Facet> with the facets for the specified field. Use the all keyword to also retrieve older (i.e. already filtered out) facets. Facet objects have the following boolean functions available to apply proper styling:
- f.isSelected(): is this facet selected, i.e. filtered?
- f.isMust(), f.isShould(), f.isMustNot(): check the filter behaviour of this facet.
Filtering on facet
searcher := ~searcher where selectedDateFacet.must(), selectedPriceFacet.must();
Previously returned facets can be used to narrow the search results. The behaviour of the facet (must, should, mustnot) can be set on the facet object itself (should by default).
Namespace scoping: 'in e1'
searcher := search Entity matching title: "graph" in "science";
When using search namespaces, restricting a search to a single namespace is done using the in keyword followed by a String-compatible expression.
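The constructs above can be applied step by step to an existing searcher through the ~searcher form. The following sketch uses the Message entity from the mapping example. The function name is hypothetical, and the use of a String literal as filter value in the where clause is an assumption based on the hidden=false example.

```webdsl
// Sketch: build a searcher and refine it in separate steps
function humorMessages(userQuery : String) : List<Message> {
  // boolean query on two search fields: userQuery must match, "ipad" must not
  var searcher := search Message matching subject, text: +userQuery, -"ipad";
  // refine the existing searcher with a field-value filter
  searcher := ~searcher where category="humor";
  // refine again: enable faceting on the sender's forum name
  searcher := ~searcher with facets (sender.forumName, 20);
  return searcher.list();
}
```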
Using an entity searcher
The searcher class generated for the example Message entity is MessageSearcher. The first advantage of using this searcher instead of the generated functions is the ability to interact with it: to refine the search query further, or to get information such as the total number of results or the time needed to perform the search.
define page searchPage(query : String) {
  var searcher := MessageSearcher().query(query);
  var results := searcher.list();
  var searchTime := searcher.searchTimeAsString();
  "You searched for '" output(searcher.query()) "', " output(searcher.resultSize()) " results found in " output(searchTime) "."
  if(searcher.resultSize() > 0) {
    showResults(results)
  }
}
define showResults(results : List<Message>) {
  //code to view results
}
The available searcher functions generated for each searchable entity are:
(Dis)allow use of Lucene query syntax in query and filter values
allowLuceneSyntax(allow : Bool) : EntitySearcher
OR/AND terms in user queries by default (OR is the default)
defaultAnd() : EntitySearcher
defaultOr() : EntitySearcher
Filter results by field value, get filter value
filterByField(field : String, value : String) : EntitySearcher
getFieldFilterValue(field : String) : String
Get at most limit spell/autocomplete suggestions
The field(s) parameters specify which search field(s) to use for suggestions. Additionally, the namespace can be specified, if used. For spell suggestions, the accuracy (between 0 and 1) can be set.
static autoCompleteSuggest(toComplete : String, field : String, limit : Int) : List<String>
static autoCompleteSuggest(toComplete : String, namespace : String, field : String, limit : Int) : List<String>
static autoCompleteSuggest(toComplete : String, fields : List<String>, limit : Int) : List<String>
static autoCompleteSuggest(toComplete : String, namespace : String, fields : List<String>, limit : Int) : List<String>
static spellSuggest(toCorrect : String, fields : List<String>, accuracy : Float, limit : Int) : List<String>
static spellSuggest(toCorrect : String, namespace : String, fields : List<String>, accuracy : Float, limit : Int) : List<String>
static spellSuggest(toCorrect : String, field : String, accuracy : Float, limit : Int) : List<String>
static spellSuggest(toCorrect : String, namespace : String, field : String, accuracy : Float, limit : Int) : List<String>
In/Decrease the impact of a search field in ranking of results by boosting at query-time
boost(field : String, boost : Float) : EntitySearcher
Faceting on a search field
The max parameter defines the maximum number of facets to collect for that field. For range facets, the ranges are encoded as a String, with each range defined between parentheses and both limits of each range included: "(,-100)(-99,99)(100,199)(200,)"
enableFaceting(field : String, max : Int) : EntitySearcher
enableFaceting(field : String, rangesAsString : String) : EntitySearcher
getFacets(field : String) : List<Facet>
filterByFacet(facet : Facet) : EntitySearcher
Specify search field(s) to use for query or range
field(field : String) : EntitySearcher
fields(fields : List<String>) : EntitySearcher
Specify offset and number of results to return by the list function
firstResult(offset : Int) : EntitySearcher
maxResults(limit : Int) : EntitySearcher
Highlight by surrounding matched tokens in the given text by a pre- and posttag (bold by default) using the analyzer from the specified search field
highlight(field : String, text : String) : String
highlight(field : String, toHighlight : String, preTag : String, postTag : String) : String
Find similar entities (of the same type) based on some text (specify fields using the field(s) function)
moreLikeThis(text : String) : EntitySearcher
Set/get the current search query text. Note: the query text is returned in Lucene syntax if and only if the query is composed from different boolean queries.
query() : String
query(queryText : String) : EntitySearcher
Sort results by field ascending or descending
sortDesc(field : String) : EntitySearcher
sortAsc(field : String) : EntitySearcher
Range query, including the from and to limit
range(from : Int, to : Int) : EntitySearcher
range(from : Float, to : Float) : EntitySearcher
range(from : Date, to : Date) : EntitySearcher
Get the list of results
list() : List<Entity>
Get the number of results
resultSize() : Int
Get the search time
searchTimeAsString() : String
searchTimeMillis() : Int
searchTimeSeconds() : Float
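Since most of these functions return the searcher itself, calls can be chained. The sketch below combines a few of them for the Message entity. The page name, the boost value and the order of the calls are choices made for this example; whether the call order matters is not specified here.

```webdsl
// Sketch: all terms must match, matches on subject weigh double, at most 10 results
define page rankedSearch(userQuery : String) {
  var searcher := MessageSearcher().defaultAnd().boost("subject", 2.0).query(userQuery).maxResults(10);
  var results := searcher.list();
  output(searcher.resultSize()) " messages found in " output(searcher.searchTimeAsString())
  showResults(results)
}
```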
Range queries
Range queries can be performed using the range method. The search field to use for this query can be set through the field method (or the first field in case you use fields). Range queries can be performed on Int, Float, Date, DateTime, Time and String properties. Ranges include their upper and lower bounds.
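A minimal sketch of a range query through the searcher API, assuming a hypothetical searchable entity Review with an Int search field named rating (so that a ReviewSearcher is generated). The parameter names low and high are chosen for this example.

```webdsl
// Sketch: Review and its numeric search field rating are hypothetical
function reviewsRated(low : Int, high : Int) : List<Review> {
  // select the field first, then apply the range; both limits are included
  var searcher := ReviewSearcher().field("rating").range(low, high).sortDesc("rating");
  return searcher.list();
}
```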
Filter on specific field(s)
Filters are an efficient way to filter search results, because they are cached. If you expect to perform many queries using the same filter (like only showing Messages in a specific category), using a filter is the way to go:
MessageSearcher().query(userQuery).filterByField("category", "humor")
To get the value of a previously added field filter, use the getFieldFilterValue(field : String) method.
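For instance, a page could apply a category filter and show the active filter value above the results. This is only a sketch: the page name is hypothetical, and showResults is the placeholder template from the searcher example above.

```webdsl
// Sketch: filter Messages on category and display the active filter value
define page humorSearch(userQuery : String) {
  var searcher := MessageSearcher().query(userQuery).filterByField("category", "humor");
  "Showing messages in category '" output(searcher.getFieldFilterValue("category")) "'"
  showResults(searcher.list())
}
```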
More options
Search namespaces
Search namespaces become useful if you want to allow searches on entities with some specific property value, for example searching Messages by category in the above example. Namespaces have some advantages over using field filters. An index is created for each namespace separately, instead of one for all entities of that type. Since the indexes are used as input for auto completion and spell checking, the use of namespaces enables suggestion services scoped to one, or all, namespace(s).
Result highlighting
Spell checking
Auto completion
Faceted search
Facets can be displayed in many contexts. For example, when displaying a list of products, you may want the product categories to be displayed as facets. Any searchable property can be used for faceting. The values, as they appear in the search index, are used for faceting. So if you use the default analyzer for the category property of Product, categories containing white space are not treated as a single facet value. For this to work, you need to define an additional field that doesn't tokenize the value of the property, for example by indexing this property untokenized:
entity Product{
  name :: String
  categories -> Set<Category> (inverse=Category.products)
  searchmapping{
    name
    categories
  }
}
entity Category {
  name :: String
  products -> Set<Product>
  searchmapping{
    name using none //or 'name using no' in v1.2.9.0
  }
}
Facets can be retrieved through the use of a searcher. You first need to specify the facets you want to use by enabling them in the searcher. A typical example is to display facets in the search results:
(updated April 5th)
define searchbar(){
  var query := "";
  form {
    input(query)
    submit action{
      //construct a searcher and enable faceting on categories.name, limited to 20 top categories
      //more facets can be enabled by separating the (field,max) tuples by a comma
      var searcher := search Product matching query with facets (categories.name, 20);
      return search(searcher);
    } {"search"}
  }
}
define page search(searcher : ProductSearcher){
  var results : List<Product> := get results(searcher);
  //facets matching the searcher's constraints, including facet constraints by prior facet filtering
  //var facets : List<Facet> := get facets(searcher, categories.name);
  //facets matching the searcher's constraints, ignoring already filtered facets,
  //so the user can enable facets that are not there anymore because of prior facet filtering
  var facets : List<Facet> := get all facets(searcher, categories.name);
  header{"Filter by product category:"}
  for(f : Facet in facets){
    facetLink(f, searcher)
  }separated-by{" "}
  showResults(results)
}
define facetLink(facet: Facet, searcher: ProductSearcher){
  submitlink narrow(facet){ if(facet.isSelected()){"+"} output(facet.getValue()) }"(" output(facet.getCount()) ")"
  action narrow(facet : Facet){
    if (facet.isSelected()) { searcher.removeFilteredFacet(facet); } else { ~searcher where facet.must(); }
    goto search(searcher);
  }
}
