''(You may also be looking for [[ElasticSearch]], where config and maintenance info is kept. This page is about the search features in masto generally.)''

== Mastodon ElasticSearch ==

The basic masto full-text search uses [[Uses::ElasticSearch]] via [[Uses::Chewy]].

=== Indexing ===

==== Summary ====

The chewy indexes are defined in the <code>app/chewy</code> directory - https://github.com/NeuromatchAcademy/mastodon/tree/main/app/chewy

The actual work of creating the index seems to happen in
* <code>app/lib/importer</code> - https://github.com/NeuromatchAcademy/mastodon/tree/main/app/lib/importer and
* <code>app/workers/scheduler</code> - https://github.com/NeuromatchAcademy/mastodon/blob/main/app/workers/scheduler/indexing_scheduler.rb

The '''Importers''' define how the individual objects to index are loaded and added to the search index, and the '''Scheduler''' runs the importers periodically. The Indexes themselves describe how Chewy handles each object.

When a relevant object is updated, it is added to the indexing queue using an <code>update_index</code> method (eg. [https://github.com/NeuromatchAcademy/mastodon/blob/6958bd33cd89ecf11a24d6fd4f9f71c5d8c8ae3a/app/models/status.rb#L51 Status:update_index]). The Scheduler feeds each object to be indexed to the importer.

==== Filters ====

Each importer has a set of rules that determine if something should be added to the index. When indexing, the importers also check to see if an object is <code>searchable_by</code> any accounts - described in the next section.

Note that because of these filters, far fewer posts are indexed than the server contains. Also important is that by default the indexing does '''not''' respect <code>noindex</code> or <code>nobot</code> tags in profiles.

Those rules, summarized:

[https://github.com/NeuromatchAcademy/mastodon/blob/6958bd33cd89ecf11a24d6fd4f9f71c5d8c8ae3a/app/lib/importer/statuses_index_importer.rb#L68-L86 '''Status''']
* Statuses that '''Mention''' a local account
* Statuses that have been '''Favorited''' by a local account
* Statuses that include a '''Poll''' that has been voted on by a local account
* Statuses that were '''Boosted''' by a local account.
* Statuses that were '''Created''' by a local account.

[https://github.com/NeuromatchAcademy/mastodon/blob/main/app/lib/importer/accounts_index_importer.rb '''Account''']
* Accounts that are '''searchable''' - ie. that are not unapproved, suspended, or moved.

[https://github.com/NeuromatchAcademy/mastodon/blob/main/app/lib/importer/tags_index_importer.rb '''Tags''']
* All hashtags! (except those that are in unlisted posts? -- Manisha)

=== Querying ===

The search service is located in <code>[https://github.com/NeuromatchAcademy/mastodon/blob/main/app/services/search_service.rb services/search_service.rb]</code>

It queries indexed objects and, relevant to full-text search, also applies the <code>[https://github.com/NeuromatchAcademy/mastodon/blob/6958bd33cd89ecf11a24d6fd4f9f71c5d8c8ae3a/app/models/status.rb#L173-L193 searchable_by]</code> filter in the <code>Status</code> model.

The <code>searchable_by</code> method returns a list of (local) account IDs that are capable of receiving a given status in a search. Those accounts include:
* The local account that '''Created''' the status.
* Local accounts that are '''Mentioned''' in the status
* Local accounts that have '''Favorited''' the status
* Local accounts that have '''Boosted''' the status
* Local accounts that have '''Bookmarked''' the status
* Local accounts that have '''Voted on a Poll''' in the status.

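The list above can be sketched in plain Ruby to make the logic concrete. This is only an illustration, not the code in <code>status.rb</code>: the <code>Account</code> and <code>Status</code> structs, their field names, and the <code>visible_in_search?</code> helper are all hypothetical stand-ins for the real Rails models, and the real method works through database queries rather than in-memory arrays.

```ruby
# Hypothetical stand-ins for the Rails models (assumption: not the real classes).
Account = Struct.new(:id, :local, keyword_init: true)

Status = Struct.new(:author, :mentioned, :favourited_by, :boosted_by,
                    :bookmarked_by, :poll_voters, keyword_init: true) do
  # Collect every account that interacted with the status, keep only the
  # local ones, and return their IDs without duplicates.
  def searchable_by
    candidates = [author] + mentioned + favourited_by + boosted_by +
                 bookmarked_by + poll_voters
    candidates.select(&:local).map(&:id).uniq.sort
  end
end

# Hypothetical helper: a status can appear in a search only if the
# searching account's ID is in the searchable_by list.
def visible_in_search?(status, searcher)
  status.searchable_by.include?(searcher.id)
end

alice  = Account.new(id: 1, local: true)
bob    = Account.new(id: 2, local: true)
remote = Account.new(id: 3, local: false)
carol  = Account.new(id: 4, local: true)

# A status written by a remote account that mentions alice, was favorited
# by bob and the remote author, and was bookmarked by alice.
status = Status.new(author: remote, mentioned: [alice],
                    favourited_by: [bob, remote], boosted_by: [],
                    bookmarked_by: [alice], poll_voters: [])

p status.searchable_by              # => [1, 2]
p visible_in_search?(status, carol) # => false
```

In this sketch, carol is a local account but never interacted with the status, so it would not show up in her full-text search results, while alice and bob could both find it.
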
After <code>searchable_by</code> is calculated for a given status, the status can also be excluded from search results if it fails the <code>[https://github.com/NeuromatchAcademy/mastodon/blob/main/app/lib/status_filter.rb StatusFilter]</code>, which removes
* Accounts that are '''blocked, muted,''' or on a '''domain blocked''' by the post creator
* Accounts that have been '''silenced''' by the instance and are not following the creator of the status.

<code>StatusFilter</code> also removes statuses that fail the <code>[https://github.com/NeuromatchAcademy/mastodon/blob/6958bd33cd89ecf11a24d6fd4f9f71c5d8c8ae3a/app/policies/status_policy.rb#L10-L21 StatusPolicy:show]</code> check, which excludes
* Remote accounts if the post is '''local only'''
* Accounts that are '''suspended'''
* Accounts that are '''not mentioned''' if the visibility of the post is "direct" or "limited"
* Accounts that are '''not mentioned''' and '''do not follow''' the posting account if the visibility is set to "private" (followers only)
* Accounts that are '''blocked,''' or on a domain that is blocked by the creator of the status.

== Other Implementations ==

* VyrCossont has a parametric search that builds on top of the base masto search: https://github.com/VyrCossont/mastodon/pull/9

== See Also ==

* [[ElasticSearch]]

== References ==

* ElasticSearch docs - https://docs.joinmastodon.org/admin/optional/elasticsearch/
* Chewy - https://github.com/toptal/chewy

=== Prior Conversations ===

* Search thread on [[Tech WG]] hacks channel - https://discord.com/channels/1049136631065628772/1094738707086581790/1094738707086581790