Open
Opened Apr 21, 2020 by tleb (@tleb, Developer)

Sexy search


The goal of this MR is to fix the search issue #96. Assume we have a user with first name Jean-François, last name Du Pont and nickname Ai'gnan. Here is a list of queries that previously did not return him but now do (search was, and still is, case-insensitive):

  • jean françois (missing hyphen);
  • jean-francois (missing cedilla on the ç);
  • jean francois (missing both);
  • dupont (missing space);
  • françois (not the start of his first name);
  • aignan (missing apostrophe).

You get the idea: there are many mistakes a human can make. The MR also sorts results by User.last_update so that old accounts do not end up at the top of common queries (such as first-name-only or last-name-only queries).

How it works

For those who don't know, search is handled by Xapian (the search backend) through the haystack library, which provides a Django-friendly interface to multiple search backends. Xapian maintains a sort of duplicate of the database, optimised for search operations and limited to the models we want to search. Its "models" are called "indexes" (see core.search_indexes.UserIndex for the user model).

Every time a user is created or modified, it is indexed again (through a signal handler) so that Xapian knows about the change. For the user search, what gets indexed is the string rendered by the core/templates/search/indexes/core/user_auto.txt template. For the example above, it looks like this:

jean francois
du pont
aignan
jeanfrancois
dupont

jeanfrancoisdupont

As you can see, accents and punctuation are removed and everything is lowercased. There are also near-duplicates with different spacing, because we use an autocomplete algorithm that matches from the beginning of words.
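As a rough illustration of the normalisation described above, here is a Python sketch that reproduces the indexed lines for the example user. It is not the actual template (user_auto.txt is a Django template). The function names normalize and indexed_lines are hypothetical, and the rules are inferred only from the example output shown above.

```python
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents, drop apostrophes, turn other punctuation into spaces.

    Assumption: these rules are guessed from the example output, not read
    from the real template.
    """
    # Decompose accented characters (ç -> c + combining cedilla), then drop the marks.
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Apostrophes disappear entirely (Ai'gnan -> aignan).
    cleaned = without_accents.replace("'", "")
    # Any remaining non-alphanumeric character (hyphen, space...) becomes a space.
    spaced = "".join(c if c.isalnum() else " " for c in cleaned)
    return " ".join(spaced.lower().split())


def indexed_lines(first_name: str, last_name: str, nick_name: str) -> list[str]:
    """Return the lines that would be indexed for one user (hypothetical helper)."""
    first = normalize(first_name)
    last = normalize(last_name)
    nick = normalize(nick_name)
    joined_first = first.replace(" ", "")
    joined_last = last.replace(" ", "")
    return [first, last, nick, joined_first, joined_last, joined_first + joined_last]


if __name__ == "__main__":
    for line in indexed_lines("Jean-François", "Du Pont", "Ai'gnan"):
        print(line)
```

Running it on the example user prints the same six non-empty lines as the listing above.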

The one I am not sure about is the last one (jeanfrancoisdupont). Its goal is to allow searching without putting a space between the first name and the last name. Is this useful?
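To help answer that question, here is a simplified Python model of prefix matching over the indexed words. It is only a sketch: the real matching is done by Xapian through haystack, and the rule assumed here (every query word must be a prefix of at least one indexed word) is a simplification. The function name matches is hypothetical.

```python
# Indexed lines for the example user, copied from the listing above.
INDEXED = [
    "jean francois",
    "du pont",
    "aignan",
    "jeanfrancois",
    "dupont",
    "jeanfrancoisdupont",
]


def matches(query: str, indexed_lines: list[str]) -> bool:
    """Assumed rule: each query word must be a prefix of some indexed word."""
    words = [word for line in indexed_lines for word in line.split()]
    return all(
        any(word.startswith(query_word) for word in words)
        for query_word in query.split()
    )


if __name__ == "__main__":
    without_last = INDEXED[:-1]
    for query in ["francois", "dupont", "jeanfrancoisdup", "ancois"]:
        print(query, matches(query, INDEXED), matches(query, without_last))
```

Under this model, a query such as jeanfrancoisdup only matches thanks to the last indexed line, so that line matters only for users who type the first and last names with no space in between.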

Production will need to run ./manage.py update_index; I am not sure whether the upgrade script already does it.

Reference: ae/Sith!269
Source branch: sexy-search