Regarding Search Coming to Mastodon (Thread)

Updated 08132023-172112


Full-text search is coming in a future Mastodon patch. Here's what you can do to opt out if you don't want your posts or profile to be indexed and included in full-text search results:

https://dotart.blog/dotart-blog/full-text-search-coming

(See replies to this post too)


@Curator fuuuck that is so annoying :( I like just having hashtags searchable >_<

I can't seem to parse if hashtags will even be usable if you opt out?


@evel They will. You'll need to opt out at the profile level (by turning off that checkbox) so that even your public posts don't get indexed for text search. You can then make public posts with hashtags, which will show up in hashtag search results but not in text-search results.
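
The rule described in this reply can be sketched in a few lines of Python. This is only a toy model of the behaviour as explained in the thread, not Mastodon's actual code; `Post`, `author_opted_in`, and the three functions are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    text: str
    visibility: str        # e.g. "public" or "unlisted"
    author_opted_in: bool  # hypothetical name for the profile-level checkbox


def has_hashtag(post: Post) -> bool:
    # Crude check: any word that starts with "#" and has at least one more character.
    return any(w.startswith("#") and len(w) > 1 for w in post.text.split())


def in_hashtag_search(post: Post) -> bool:
    # Hashtag results do not depend on the full-text checkbox.
    return post.visibility == "public" and has_hashtag(post)


def in_full_text_search(post: Post) -> bool:
    # Full-text results need a public post AND the profile-level opt-in.
    return post.visibility == "public" and post.author_opted_in


opted_out = Post("New sketch #MastoArt", "public", author_opted_in=False)
print(in_hashtag_search(opted_out))    # True
print(in_full_text_search(opted_out))  # False
```

In this model, turning the checkbox off removes a public post from text search while leaving its hashtags findable, which matches the answer above.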


@Curator ty! i have been trying to figure this out on my own (mistake)


@evel The tech people in fedimins had to go into the main Discord and ask Claire the nuances of how this will work to understand it XD


@Curator ohhh ok that makes me feel better about my barely working brain!!


@Curator i think there’s an important detail missing in your description of the feature: it adds full text search for posts from other servers. Full text search over only the local posts of a server is already in Mastodon and has been for a long time.


@esther Ah yeah, I'll clarify. We don't have elastic search enabled here, hehe.


@esther @Curator (I can't see what you replied to, I blocked that server due to seemingly bad administration) isn't remote fulltext search something that was deliberately excluded from Mastodon due to harassment potential?


@esther ...is it? full text search absolutely does not work for me on mastodon.social. Full text search only works for people I follow.


@SirTapTap it has additional limits by design, it only includes your own posts, favs, bookmarks and I think also replies that you received


@Mayor_of_Smartarse That feature has no impact on Mastodon’s search at the moment and only concerns search engines like Google, Bing, etc.

@Curator


@Mayor_of_Smartarse @Curator that one is for external search engines only, like Google or Bing. You can have either or both of them on or off


@Curator blimey, that’s a big change


@Curator
nooooooooooooo


@Curator@mastodon.art I assume each server can only search the posts that have been federated to the server. Correct?


@Blue Yes :)


@Curator do you have any link to the announcement?


@tommy https://github.com/mastodon/mastodon/releases/tag/v4.2.0-beta1


@Curator can you give me a word to search for in the changelog? The term "search" doesn't lead me anywhere


@tommy my bad, I linked you the wrong thing XD https://github.com/mastodon/mastodon/pull/26344


@tommy also see https://oisaur.com/@renchap/110856034083717387


@Curator Thanks for making this understandable to laypersons. I dutifully read a lot of the techy posts, but it's normal for 50-75% of the terminology to just whoosh right over my head. ☕


@Curator

I absolutely love it!

Whilst hashtags are great, especially to follow, to find something more specific, full-text search will be a true leap forward!

As long as there is an opt-out option (there is, apparently), I don't think there's a reason to complain.

You guys are awesome!


@Mina it's not us that made it, it's the Mastodon dev team - I'm just posting about it to inform our community here on mastodon.art :)


@Mina @Curator

I like it too. Text search is SUPER important for me. I follow a lot of COVID scientists that just don't hashtag anything, because they just don't understand how hashtags work or they're (understandably) too busy to care.


@Curator
@oliphant
Quick question - does "Suggest account to others" work cross-instance, or only on the instance you are on?
If the latter, those with their own personal instances will be able to avoid it without losing any functionality, while everyone else will have to make a choice.


@stuartb @Curator Recommend should work across the network, not just on your own server.


@Curator since there is going to be the option to opt-out, i think this can be awesome! searching by text can be helpful sometimes


@Curator Personally I prefer hashtag search, but I know some dislike it.
But this feels like it is for the interoperability between mastodon.social and the opt-in instances and #META / #Threads - I rather suspect it might be.
#fulltextsearch #hashtagsearch #ChangingFediverse


@robchapman @Curator Weird thing is, Instagram only has hashtag search, no? And Threads doesn't have any search so far. 🤔


@tokyo_0 Oh right, I don't use any META products, I was under the impression Threads had a text-based search - because how is it even usable without search?
Perhaps the creation of text search fosters an environment with a more familiar search function for its users when Threads federates.


@Curator Oh hell yes. If they can just make it work for unlisted posts as well then we're finally at parity with Twitter


@hex if they do that, they need another post privacy selector that doesn't get indexed. It's already not granular enough, imo. Folk need many ways to be able to opt in and to choose not to opt in, globally and per individual post, without sacrificing overall discoverability of their profiles.


@Curator Yep, fully with you on that. Personally speaking I just want to be able to ignore the entire existence of the local and federated feeds without being totally invisible


@hex That’s not making us “at parity with Twitter”. We had a strong anti-harassment feature that’s going to be removed.

@Curator


@hex @Curator it's probably a good idea for unlisted toots not to be searchable, for many reasons, but I think one of the big ones is to make public posts more distinct.

Of course, anyone can choose to have all their toots and responses set to public and to broadcast every musing and answer they do (I've done it sometimes in the past), but that's a bit unusual, so keeping unlisted as the default makes sense.


@hex @Curator It is possible for an instance to index unlisted posts and let only the author search it, but there's significant additional processing required. If that's what you need, it might be good to get your own instance and control everything that way.


@Curator suppose I am an opt-out Mastodon user and one of my toots (let’s say unlisted) gets boosted by an opt-in Mastodon user, would that toot be indexed too?

I have a feeling that it would get indexed too (just as posts from users who opted out of search engines get indexed through a boost by a user who opted in)


@joyfuluselessness no, because the post itself is unlisted - even if you, as the user, globally set your profile to opt in, it will still only index public posts.

I can't say the same for a public post though, I'll see if I can find out.


@Curator thanks <3


@joyfuluselessness it shouldn't affect it: the setting is set on the originating toot/profile, the boost itself doesn't contain anything to index, it's just a link to the original toot (thanks @rrgeorge !)
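
The two answers above (only public posts are indexed, and a boost is just a link to the original toot) can be sketched as a small Python model. It is an illustration of the behaviour as described in this thread, not Mastodon's implementation; `Account`, `Status`, `opted_in`, `boost_of`, and `is_indexed` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    name: str
    opted_in: bool  # hypothetical name for the profile-level search setting


@dataclass
class Status:
    author: Account
    visibility: str                      # e.g. "public" or "unlisted"
    text: str = ""
    boost_of: Optional["Status"] = None  # set when this status is a boost


def is_indexed(status: Status) -> bool:
    # A boost has no text of its own; the original toot is judged separately.
    if status.boost_of is not None:
        return False
    # The decision uses the original author's setting and the post's visibility.
    return status.visibility == "public" and status.author.opted_in


alice = Account("alice", opted_in=False)  # opted out
bob = Account("bob", opted_in=True)       # opted in

unlisted = Status(alice, "unlisted", "a quiet thought")
public = Status(alice, "public", "a loud thought")
boosts = [Status(bob, "public", boost_of=unlisted), Status(bob, "public", boost_of=public)]

print(is_indexed(unlisted), is_indexed(public))  # False False
print([is_indexed(b) for b in boosts])           # [False, False]
```

Under these assumptions, the booster's own setting never enters the decision for the original toot.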


@Curator nice nice


@Curator this sounds like a good thing but based on the caution and apprehension surrounding it, is there anything I should be worried about with my content?


@lesbunny the reason it's been viewed with apprehension is it could be used as a vector for harassment, perhaps most excellently demonstrated recently by the Universeodon admin who appeared to be namesearching himself to dive into the mentions of people talking about him.


@Curator ah. embarrassing to admit but I thought this was already possible since I came from Twitter.


@lesbunny @Curator it is already possible. full text search of your local cache has been in the *oma and *key server software since 2018. Gargron is just adding it to mainline after a lot of Mastodon forks started adding it too, and his users started asking why they didn't have it too. Before that, it was condemned as the most evil thing in the world - much like quote posts, which are also widely available elsewhere in the fediverse.


@davidgerard @lesbunny please don't flippantly dismiss the very valid harassment concerns people have about features that have been used to harass them on other sites. You can express your personal desire for a feature without resorting to that.


@Curator @lesbunny I am telling you that it's been here for ages. If you don't understand that, then you are being sold a pile of nonsense. And mastodon.social has been selling a lot of utter nonsense.


@davidgerard @lesbunny XD a) been here since 2016, b) we don't have elastic search enabled on .art, c) I know other instances already have both elasticsearch and full text search, d) my point wasn't about any of this, but about dismissing people's valid concerns about tools used for harassment.


Yeah @davidgerard@circumstances.run I agree with @Curator@mastodon.art here. Mastodonians have strongly and repeatedly expressed their opposition to non-consensual search. It's true that other software and some Mastodon instances have had non-consensual search for a while; and it's also true that other security and privacy weaknesses in Mastodon software mean Eugen's repeated claims about how much safety the absence of search here actually brings are very inflated. But instead of saying "ok fine, we'll just do non-consensual search too", a better option is to do an opt-in search here.

It's frustrating because the PR is actually close to an opt-in approach. All they need to do is add a new "searchable" property, off by default (or alternatively change the "discoverable" property, which is already off by default, to be an enum with various options). In its current form, though, it retroactively and non-consensually "opts in" (hahahaha) past public posts which were set to discoverable. That's not what people were consenting to when they marked their profiles as discoverable, so it's not informed consent -- it's the kind of deceptive practice Facebook uses.

BTW it wouldn't surprise me at all if the implementation gets modified to go this route. But still, it would have been a lot better if they had circulated the proposal for feedback at the design stage -- or at least before approving the PR.

@lesbunny@urusai.social


@jdp23 @Curator @davidgerard I don't understand much of this but I really appreciate the three of you coming together to try and explain it ❤️


@lesbunny@urusai.social Glad it's useful! It's been an ongoing conversation for the last six years so there's a lot of context. I wrote about some of it in https://privacy.thenexus.today/mastodon-privacy-remember-that-public-and-unlisted-posts-can-be-indexed-by-search-engines/ but that's only the tip of the iceberg.

@Curator@mastodon.art @davidgerard@circumstances.run


@jdp23 @Curator @davidgerard oh geeze that's so messy. if I understand this correctly, then people might start incorrectly blaming others for their non-indexed posts becoming searchable, just because of the method of interaction a searchable person used with them. that could cause people to fight, thinking they have a difference in values (like privacy, respect, etc) when in reality it's a difference in systems way above both their heads.


@jdp23 @Curator @lesbunny i think you need to be clearer that we're talking literally about "is an instance allowed to search its own hard drive". We are talking about a function to allow an instance to search its own hard drive.

You are condemning people as not caring about privacy concerns on the basis that you're worrying about whether someone else is allowed to search their own hard drive, and condemning those who point out that that's the thing you're actually talking about.

What are the moral implications of being allowed to search your own hard drive?

You're misrepresenting to people that you care more because you claim to have the power to stop other people searching their own hard drive.

I submit that this is an inane claim, and as this thread progresses it becomes clearer that that actually needs to be pointed out.

Certainly you rely on the good faith of others not to abuse the power of having local disk caching switched on, as it usually is. But you were literally never not doing that. If you represent to people that you were ever not doing that, you're lying.


@davidgerard @jdp23 @Curator @lesbunny I'm not sure I buy this framing, unless we consider the hard drive of every instance to be the communal property of its members.


@davidgerard@circumstances.run I think you need to be clearer that you are literally buying into the same framing that surveillance capitalism companies use, that once they get access to data for one purpose they can continue to use it for whatever purpose they want without getting meaningful consent.

You're misrepresenting to people that "construct a database on a server for the purpose of algorithmically analyzing all the posts received by a server without consent" is equivalent to somebody searching their hard drive. What are the moral implications of the parallel construct that "training an AI model on all data received by a server without asking consent" is equivalent to searching a hard drive? If you represent to people that you were not doing that, you're lying.

Certainly you rely on the good faith of surveillance capitalism companies and others not to abuse that power but I submit that this is an inane position, and as your post progresses it becomes clearer that that actually needs to be pointed out.

As always, thanks for the conversation!

@Curator@mastodon.art @lesbunny@urusai.social


@jdp23 @Curator @davidgerard I had a thought. If the instance servers really are privately owned, regardless if they're publicly used, then the only difference between mastodon and twitter is quantity.

Like if an instance admin really is allowed to do whatever they want with something mostly used by other people (whether we want that or not, just talking about real not ideal) then Twitter is monopoly class and Mastodon is a ton of startup class. Both are private propertied class as a whole, so the class relations don't qualitatively change. Only how powerful individual owners are.

If the goal is to hand power to the collective users...


@jdp23 @Curator @davidgerard then small reforms like searchability etc will never change the relationship to that type.

Rather, the people forcibly take over and make Twitter public property like a park or library. In that case, its centralization would work for the people instead of against them. Fighting that power with pure quantitative reduction (private mastodon) is just running away from the root of the problem: private ownership. Not monopoly, which would be progressive if monopoly is public (like public education)

But if instance servers are truly owned by the users, as a collective, then the relationship itself is different.

