
Steem Sincerity - Improved Anti-Spam API

BY: @andybets | CREATED: March 26, 2018, 12:09 p.m. | VOTES: 522 | PAYOUT: $373.02 | [ VOTE ]

[IMAGE: https://steemitimages.com/DQmZejyqAnYoWWMw7iiyuNUdrUGr1ckbpTZkSpuhZ9KHF3d/image.png]

Steem Sincerity is a project aimed at helping to address the spam problem we have on Steem.

As I explained in my introductory post, there are three aspects to this. This post discusses the most important one in more detail.

Public API for Developers

This is a service hosted on my server(s), which any front-end website or app can query to obtain information about Steem accounts. It uses a database that stores the last 7 days' worth of posts, comments and votes.

Periodically, the software extracts meta-data (data about the data) from these accounts, and application developers can easily access much of it using the methods below. The meta-data for each account is also fed into a kind of artificial intelligence software, which looks at how it compares to known spamming and bot accounts so that it can 'classify' each active account.

What is classification?

In machine learning, classification is an approach in which the computer program learns from the data input given to it and then uses this learning to classify new observations. So from our perspective, we first 'train' the classifier by giving it three lists of Steem accounts that have been manually classified as either Human Content Creator, Spammer or Bot.

It is programmed to extract the relevant meta-data, or what are called features in machine learning, from these accounts. Some of the many features used in the Steem Sincerity software are: number of comments, number of posts, average number of downvoted comments, average word length etc. It looks at how these features vary between the different classes of account, and makes rules for itself to use when deciding how to classify accounts that it hasn't seen before.
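To make the idea of features concrete, here is a minimal sketch of how a few of them might be computed from an account's recent activity. The record layout (dicts with `body` and `downvotes` keys) and the exact feature definitions are assumptions for illustration only, not the ones Sincerity uses.

```python
from statistics import mean

def extract_features(posts, comments):
    """Turn one account's recent activity into a feature dictionary.

    `posts` and `comments` are lists of dicts. The keys 'body' (str) and
    'downvotes' (int) are hypothetical, chosen only for this example.
    """
    words = [w for c in comments for w in c["body"].split()]
    return {
        "num_posts": len(posts),
        "num_comments": len(comments),
        # Share of comments that received at least one downvote.
        "downvoted_comment_ratio": (
            mean(1.0 if c["downvotes"] > 0 else 0.0 for c in comments)
            if comments else 0.0
        ),
        # Average length, in characters, of the words used in comments.
        "avg_word_length": mean(len(w) for w in words) if words else 0.0,
    }

example = extract_features(
    posts=[{"body": "My first post"}],
    comments=[
        {"body": "nice post", "downvotes": 2},
        {"body": "great work thanks", "downvotes": 0},
    ],
)
print(example)
```

Each account then becomes one row of numbers, which is the form a classifier needs in order to compare it with the accounts in the training lists.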

The classifier has currently been trained on only around 30 accounts of each type and has a cross-validation accuracy of around 78%; very little non-spam is classified as spam. Cross-validation is a standard technique for evaluating the accuracy of a classifier, but of course what constitutes spam is highly personal, so inevitably my preferences will have introduced biases. A larger crowdsourced training set is planned to reduce this bias in the near future.
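For readers unfamiliar with cross-validation, the sketch below shows one common flavour, leave-one-out: each labelled account is held out in turn, classified using the remaining accounts, and the fraction of correct answers is reported. The post does not say which flavour Sincerity uses, and the simple nearest-neighbour rule and the feature values here are made-up stand-ins for the real classifier and data.

```python
import math

def nearest_label(train, point):
    """1-nearest-neighbour: return the label of the closest training point."""
    return min(train, key=lambda item: math.dist(item[0], point))[1]

def leave_one_out_accuracy(samples):
    """Hold out each sample in turn, classify it using the rest, and
    return the fraction of held-out samples that were labelled correctly."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_label(rest, features) == label:
            correct += 1
    return correct / len(samples)

# Made-up (comments_per_day, avg_word_length) pairs with manual labels.
samples = [
    ((2.0, 5.1), "human"), ((3.0, 4.8), "human"), ((1.0, 5.5), "human"),
    ((40.0, 3.9), "spammer"), ((55.0, 3.5), "spammer"), ((60.0, 4.0), "spammer"),
]
accuracy = leave_one_out_accuracy(samples)
print(f"leave-one-out accuracy: {accuracy:.0%}")
```

With only a handful of hand-labelled accounts per class, a score like this is a rough guide at best, which is why a larger training set matters.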

Rather than making a direct prediction about whether an account belongs to a spammer, the API actually returns the probabilities of the account belonging to each of the three classes. For example, an account may show the following classification scores:
> Human Content Creator: 45%
> Spammer: 45%
> Bot: 10%
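One simple way a classifier can produce per-class probabilities like these is to look at the most similar labelled accounts and report what share of them falls into each class. The sketch below does this with a nearest-neighbour vote; the function name, the value of `k` and the training points are assumptions for illustration, not the Sincerity implementation.

```python
import math
from collections import Counter

CLASSES = ("human", "spammer", "bot")

def class_probabilities(train, point, k=5):
    """Return, for each class, the fraction of the k nearest labelled
    accounts that belong to it."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return {c: votes[c] / len(nearest) for c in CLASSES}

# Made-up (comments_per_day, avg_word_length) pairs with manual labels.
train = [
    ((9.0, 4.0), "human"), ((8.0, 4.0), "human"), ((0.0, 4.0), "human"),
    ((11.0, 4.0), "spammer"), ((12.0, 4.0), "spammer"),
    ((13.0, 4.0), "bot"), ((30.0, 4.0), "bot"),
]
probs = class_probabilities(train, (10.0, 4.0), k=5)
print(probs)  # {'human': 0.4, 'spammer': 0.4, 'bot': 0.2}
```

Returning the three shares instead of a single label lets each application decide for itself how cautious to be.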

Each front-end using the API can make its own decision about what should happen at different spam thresholds. For example, it could fade the comment if the spammer score is between 40% and 70%, and hide it altogether if the score exceeds 70%. It could even leave this up to the user to decide.
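As a sketch of the thresholding described above, a front-end might map the spammer score to a display mode like this. The function name and the return values are hypothetical; the default thresholds are the 40% and 70% from the example, expressed as fractions, and could just as well come from a user setting.

```python
def display_mode(spammer_score, fade_at=0.40, hide_above=0.70):
    """Decide how to render a comment from its author's spammer score (0.0-1.0)."""
    if spammer_score > hide_above:
        return "hide"
    if spammer_score >= fade_at:
        return "fade"
    return "show"

for score in (0.10, 0.45, 0.70, 0.85):
    print(score, display_mode(score))
```

Keeping the thresholds as parameters makes it straightforward to expose them as a user preference.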

[IMAGE: https://steemitimages.com/DQmSRhEPKfUiPW29D6MdneVtkETUxyTd23wbhoPaX9RwayZ/image.png]

This is a very simple illustration of how accounts with comments containing certain combinations of features may be classified as spam. The red dots represent real spamming accounts, and the pink area shows the accounts which are classified as spammers. The accuracy is not perfect, but good enough to be useful. In practice, the machine learning algorithm in the Sincerity software uses far too many features to show in a two-dimensional diagram.

API Specifications

If you are a developer, you can find the API specification here. There are currently 10 methods, and since the main intention is to help improve front-end user experiences, performance is prioritised over having larger amounts of historical data. Currently no API keys are required, and request rate limiting is fairly relaxed, but this may need to change depending on future demand.

Main Methods

/api/accounts-info/account1,account2,account3

This expects a comma-separated list of accounts, and returns various useful meta-data about them. This includes the probability that each account is a Human Content Creator, Spammer or Bot. It also includes some metrics about the commenting and voting behaviour of the accounts. Note that only accounts which have commented recently (the database holds the last 7 days of activity) will have records in the database. Because up to 100 accounts can be queried at a time, this is the most useful method for hiding or changing the appearance of spam in your application.
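Since each request accepts up to 100 accounts, a front-end rendering a long comment thread would need to split its account list into batches. The sketch below only builds the request URLs; the host name is a placeholder (the real one is in the API specification), and the helper name is hypothetical.

```python
BASE_URL = "https://sincerity.example"  # placeholder host, not the real server
MAX_ACCOUNTS_PER_REQUEST = 100          # limit stated in the post

def accounts_info_urls(accounts, base_url=BASE_URL):
    """Split `accounts` into batches of at most 100 names and return one
    accounts-info URL per batch."""
    urls = []
    for start in range(0, len(accounts), MAX_ACCOUNTS_PER_REQUEST):
        batch = accounts[start:start + MAX_ACCOUNTS_PER_REQUEST]
        urls.append(f"{base_url}/api/accounts-info/{','.join(batch)}")
    return urls

# 250 commenters on a page would need three requests.
urls = accounts_info_urls([f"user{i}" for i in range(250)])
print(len(urls))  # 3
```

Each URL can then be fetched with whatever HTTP client the application already uses, and the results merged by account name.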

/account-full-info/account1

This returns the complete analysis information that is held for the specified account. There are many fields, a few of which are unused. You may want to query this when an account profile is clicked, for example.

/account-comments/account1

Returns a time-sorted list of the comments made by the specified account in the last 7 days.

/account-outgoing-votes/account1

Returns a time-sorted list of the votes made by the specified account in the last 7 days.

/account-outgoing-downvotes/account1

Returns a time-sorted list of the flags given by the specified account in the last 7 days.

/account-apps-used/account1

Returns the list of apps the specified user has used to post and comment in the last 7 days.

/biggest-spammers/

Returns the 500 accounts most likely to be spamming accounts. This may be useful for stakeholders employing bots to clean up the platform.

There are a few other methods, and I will add more over time.

I'll be improving the Chrome Sincerity extension soon, to use some of these new methods.

If you have other requirements for a different API method or need to apply machine learning to different data, I'd be delighted to work for STEEM ;)

TAGS: [ #steemdev ] [ #steem ] [ #spam ] [ #api ] [ #steem-sincerity ]

Replies

@steevc | March 26, 2018, 12:14 p.m. | Votes: 2 | [ VOTE ]

We definitely need anti-spam tools as it's already an issue. I'd like to see ways to use the list of problem accounts to flag them and prevent them profiting. That should help discourage it. All the best with this project

@andybets | March 26, 2018, 12:32 p.m. | Votes: 4 | [ VOTE ]

I agree. The 'Biggest Spammers' list is only useful if a bot decides to use it. I'll also add a list of account links to steemreports soon, and after that, maybe even some kind of interface that uses SteemConnect to make it really quick and easy to flag spam manually.

@soulbella | March 26, 2018, 12:40 p.m. | Votes: 3 | [ VOTE ]

I am now very cautious in commenting post and tapping any link. Just like this account @tomole444. Every time I saw his/her comment it scares me.

@tts | March 26, 2018, 12:41 p.m. | Votes: 2 | [ VOTE ]

To listen to the audio version of this article click on the play image.
[IMAGE: https://s18.postimg.org/51o0kpijd/play200x46.png]
Brought to you by @tts. If you find it useful please consider upvote this reply.

@crypt0cry | March 26, 2018, 11:51 p.m. | Votes: 1 | [ VOTE ]

Not yet. Would u consider target=_blank or any other magic that won't take me away from the source page, i'll upvote it every time. In fact, could u link it to upvote automatically, i'd use it with joy.

@tts | March 27, 2018, 12:44 p.m. | Votes: 1 | [ VOTE ]

steemit block target=_blank :-(

@crypt0cry | March 28, 2018, 2:18 p.m. | Votes: 0 | [ VOTE ]

Yeaaah, i wanted 2 apologize and upvote anyway, even 2 a bot ;)
My response was just rude, it actually is a very useful tool and u cannot satisfy every user with his own shenanigans.
P.S. For the Chrome multiTab lovers: TabActivate; CLUT; Peek-a-Tab

@jacek-w | March 26, 2018, 1:20 p.m. | Votes: 1 | [ VOTE ]

This is really awesome! What ML algorithm did you use?

@andybets | March 26, 2018, 3:35 p.m. | Votes: 1 | [ VOTE ]

Thanks. It's currently using k-nearest neighbors, but I'm still investigating what works best.

@cardboard | March 26, 2018, 4:16 p.m. | Votes: 2 | [ VOTE ]

This is very nice post... Lol :D
Thank you for your work, tip!

@tipu | March 26, 2018, 4:16 p.m. | Votes: 0 | [ VOTE ]

Hi @andybets! You have received 0.3 SBD tip from @cardboard!
You can now delegate SP / invest in @tipU for daily profit:) Click here to learn more :)

@postpromoter | March 26, 2018, 9:21 p.m. | Votes: 0 | [ VOTE ]

You got a 59.83% upvote from @postpromoter courtesy of @devfund!

Want to promote your posts too? Check out the Steem Bot Tracker website for more info. If you would like to support the development of @postpromoter and the bot tracker please vote for @yabapmatt for witness!

@devfund | March 26, 2018, 9:41 p.m. | Votes: 9 | [ VOTE ]

This post was funded/promoted by @DevFund using a budget of about 360.00 USD on voting bots.

100% of the money sent or earned via upvotes to this account will be powered down and used to give back via promotion bots to Steem ecosystem development initiatives like this one.

https://steemit.com/@devfund/comments

@tomole444 | March 26, 2018, 9:52 p.m. | Votes: 2 | [ VOTE ]

Awesome post! I like it :) 👍
[IMAGE: https://steemitimages.com/DQmQhtuSujcFs86rvzidRnNKvVVdcYbK34fp3QR8qySUchn/upvote-bild-fertig.gif]
You got an upvote!
For more upvotes follow this account 👊🏼

@deep.throat | March 26, 2018, 9:53 p.m. | Votes: 1 | [ VOTE ]

Bid Bots and Vote Selling Disclosure

For your information, this post by @andybets has been heavily advertised using bid bots or vote selling services!

In fact, at least 161 SBD have been spent for getting this post hot and trending.

Remember that in the age of bid bots a high reward is not equivalent to good content. So be vigilant when looking at the trending and hot Steemit sections! I will help you by scanning all transactions to bid bots and making post promotions transparent and visible with comments like this one.

@yabapmatt | March 26, 2018, 10:05 p.m. | Votes: 3 | [ VOTE ]

All I can say is: wow this is freakin cool! I am going to add this to my list of things to integrate into the post promoter voting bot software!

@andybets | March 27, 2018, 6:52 a.m. | Votes: 0 | [ VOTE ]

Great! Let me know if you would like any changes on my side.

@crypt0cry | March 27, 2018, 9:11 a.m. | Votes: 1 | [ VOTE ]

Hi @andybets! Although I'm very excited about the API, as a frontend user my perspective is userish: I would rather prefer it "onload" than "onclicked".
chrome.browserAction.onClicked.addListener
If a user installs the extension, she wants it 2 b active by default. Correct me if i'm wrong.

@andybets | March 27, 2018, 9:35 a.m. | Votes: 0 | [ VOTE ]

Thanks for the input. I actually asked about this issue here (or maybe that's where you saw it?):
https://github.com/andybets/steem-sincerity-chrome-extension/issues/1

I think I now understand how this should work, and will start working on the next version of the Chrome extension soon. I think I may not even need the background page, but am very new to Chrome development.

@crypt0cry | March 27, 2018, 8:33 p.m. | Votes: 1 | [ VOTE ]

Saw it now, sorry, wasn't aware of ur awareness:)
2 your concern of load on the API, i think load is the first indicator of success and worth thinking about. like some sort of incentive 4 users 2 share their comp's resources... but as i'm diving deeper, it bcomes clear 2me, that i'm trying 2 reinvent steem and that job is already done, pretty fucking well.
If i can help u by my old cpu/hd/bandwidth and even 4 redundancyz sake, i'll gladly do.

@cliffpower | May 3, 2018, 3:53 a.m. | Votes: 0 | [ VOTE ]

Hi I'm confused why it said I was 60% spam? All my post are encouragement and from the heart? Is there something I don't know?

@andybets | May 3, 2018, 7:57 a.m. | Votes: 0 | [ VOTE ]

Hi, your account @cliffpower is not classified as spam:
http://steemreports.com/sincerity-accounts-info/?accounts=cliffpower

...do you have another that you are referring to?

@cliffpower | May 3, 2018, 3:59 a.m. | Votes: 0 | [ VOTE ]

What about the guy who does'nt pay, is there something I can do? I'm new at steem since January and still figuring this all out. Now we have spam police who just seem to steal your money. @buildawhale and @smartsteem did the same thing to me? do you have any advice :) [IMAGE: https://steemitimages.com/DQmU7PoWsAMzsALqWA8P8tiPB4jkiYrsAxBvKbsioo5kciw/%40smartsteem%20owing.png]

THANK YOU, I just want to be a good player :) I'm one man one account.

@themarkymark | May 3, 2018, 4:25 a.m. | Votes: 0 | [ VOTE ]

You are not on the @buildawhale blacklist, so I don't know how you can claim we stole money from you.

@cliffpower | May 3, 2018, 4:37 a.m. | Votes: 0 | [ VOTE ]

Why is it I don't get paid for a post from you and a post from @smartsteem. Maybe it's my mistake but I can't see what I'm doing wrong? Why didn't @smartsteem pay? Thats 70 steem dollars I invested
@[IMAGE: https://steemitimages.com/DQmU7PoWsAMzsALqWA8P8tiPB4jkiYrsAxBvKbsioo5kciw/%40smartsteem%20owing.png]

@cliffpower | May 3, 2018, 4:40 a.m. | Votes: 0 | [ VOTE ]

Thank you for replying

@themarkymark | May 3, 2018, 6:47 a.m. | Votes: 0 | [ VOTE ]

You got an upvote worth $151 from SmartSteem.

https://i.imgur.com/9AOjjG9.png

I don't see when you used my bot (@buildawhale) but it is the same thing. We respond with an upvote, we don't give you cash back. If that was the case, we would just use it for ourselves to print unlimited money and open a theme park on the moon.

@cliffpower | May 3, 2018, 10:01 p.m. | Votes: 0 | [ VOTE ]

That was the first bid I did and he paid but the second one he did not pay. I was expecting a vote value of $365.87 to be sent to my blog post? I'm going to take some time and read how all this works. Even though I've received upvotes I don't fully understand it. Cheers

@themarkymark | May 3, 2018, 11:30 p.m. | Votes: 1 | [ VOTE ]

You are looking at an estimate if no one else bids after you (which is almost never the case).

@berniesanders | March 26, 2018, 11:04 p.m. | Votes: 22 | [ VOTE ]

This is FUCKING SPECTACULAR! Thank you for putting this together.

BEWARE SPAMMERS, NOW THAT I CAN FIND YOU, I WILL COME AFTER YOU.

@drakos | March 27, 2018, 1:56 a.m. | Votes: 2 | [ VOTE ]

/biggest-spammers/ better run...

https://img.thrfun.com/img/164/250/dog_hiding_l1.jpg

@vishalsingh4997 | March 27, 2018, 3:38 a.m. | Votes: 2 | [ VOTE ]

Nice demonstration, @drakos. Ha ha..

@andybets | March 27, 2018, 6:48 a.m. | Votes: 0 | [ VOTE ]

Thanks for the support! Let me know if you want different views of the data.

@on247 | March 27, 2018, 9:54 p.m. | Votes: 9 | [ VOTE ]

The pot calling the kettle black.

@jheff-shayd | March 26, 2018, 11:04 p.m. | Votes: 0 | [ VOTE ]

Thank you for your tireless effort invested into this work to update us @andybets

@vander | March 26, 2018, 11:17 p.m. | Votes: 1 | [ VOTE ]

Thank you, this is what steemit needs!

@crypt0cry | March 26, 2018, 11:20 p.m. | Votes: 1 | [ VOTE ]

Didn't finish 2 read yet, just browsed to the methods and was awed with an urge 2 thank u. Now back 2 the article :)
OK, the rest of the article was what i already saw, but the extension is a real candy!

@sancti | March 26, 2018, 11:43 p.m. | Votes: 0 | [ VOTE ]

Nice research..
We need to eliminate spammers from steemit
I'll be willing to join in searching for more methods.

@themarkymark | March 26, 2018, 11:45 p.m. | Votes: 1 | [ VOTE ]

I was originally planning on building something like this for a global blacklist before finding was cut. When I am on a desktop I’ll check it out.

@topmoststeem | March 27, 2018, 1:29 a.m. | Votes: 2 | [ VOTE ]

Its reall to like awesome work

@daifenghai | March 27, 2018, 1:40 a.m. | Votes: 0 | [ VOTE ]

Thanks for your nice sharing ,it is a very good post

@elmac | March 27, 2018, 1:49 a.m. | Votes: 0 | [ VOTE ]

This a helpful information. Thanks @andybets for the education

@apply | March 27, 2018, 2:01 a.m. | Votes: 0 | [ VOTE ]

thats really helpful information Thankyoh

@jga | March 27, 2018, 2:16 a.m. | Votes: 1 | [ VOTE ]

This tool is amazing, thank you.
And the best part is that is developed it like an API. In this order, many of us can use it in our tools.

I have created a new tool called Custom Feed. Where you can filter posts by reputation, resteems, payout, number of votes, comments, body length, tags, authors, among others.

In this order, it will be more easy for you find the content that you want to read. Maybe you are interested in it. Details here.

@nabila17 | March 27, 2018, 2:24 a.m. | Votes: 0 | [ VOTE ]

good post...

@langsa | March 27, 2018, 2:49 a.m. | Votes: 1 | [ VOTE ]

One men one account ?

@eloyibarra | March 27, 2018, 2:50 a.m. | Votes: 1 | [ VOTE ]

to remove the spam we need to pay a moderator to eliminate spam accounts but that would no longer be unraveled

@aiman | March 27, 2018, 2:58 a.m. | Votes: 0 | [ VOTE ]

Very helpful post for understanding how to use steemit account. I am thankful to you I will follow on your instructions. I understand that this platform is very helpful for those who are true accounts. Thanks again for sharing great information.

@ernhaquen88 | March 27, 2018, 3:10 a.m. | Votes: 0 | [ VOTE ]

Wow spectaculer

@pushpendra83 | March 27, 2018, 4:02 a.m. | Votes: 1 | [ VOTE ]

Nice job
Now all steemian may be secure

@bobcastleman | March 27, 2018, 4:16 a.m. | Votes: 2 | [ VOTE ]

This is a nice piece of work! Have you been able to get the training data sets you were looking for?

@andybets | March 27, 2018, 6:55 a.m. | Votes: 0 | [ VOTE ]

Not yet. I didn't get a lot of interest, so I'm devising the best way to crowdsource it.

@salcoo | March 27, 2018, 5:32 a.m. | Votes: 0 | [ VOTE ]

Nice !
Lol i jolie d'or sûre it's a real problem the community shouldn't be the center of the action dev should dix ans be more open to problem about this platform

@teenovision | March 27, 2018, 5:52 a.m. | Votes: 1 | [ VOTE ]

let us hope that we all will be safe right now

@sanjeevm | March 27, 2018, 6:11 a.m. | Votes: 7 | [ VOTE ]

What are the use cases of these ? Is it like people can see and upvote or flag accordingly ? or is this meant for @steemcleaners?

@andybets | March 27, 2018, 6:49 a.m. | Votes: 0 | [ VOTE ]

It has many uses which app developers will decide, but one is that it can be used for re-rendering comment sections in front-ends to hide spam.

@steemreports will shortly have some tools to display this info for end-users.

@sheorath | March 27, 2018, 7:07 a.m. | Votes: 1 | [ VOTE ]

So, are you considering me as spam ?

@andybets | March 27, 2018, 7:23 a.m. | Votes: 0 | [ VOTE ]

The software just provides spam probability scores. How apps decide to use those is up to them. That said, I think your account's spam score at 43% looks too high, so will add you to the training set for next time.

@maxruebensal | April 14, 2018, 7:54 a.m. | Votes: 0 | [ VOTE ]

Same for me suddenly...

@smcaterpillar | March 27, 2018, 8:53 a.m. | Votes: 1 | [ VOTE ]

Hi, this is a pretty cool initiative. Any steps or ideas to classify posts directly as spam instead of accounts?

By the way are you reachable on any discord server? I'm working with Machine Learning on Steemit Blockchain data, too. Mainly trying to find good content, rather than punishing bad actors :-D. I'm interested in exchanging ideas if you like.

@smcaterpillar | March 27, 2018, 9:08 a.m. | Votes: 0 | [ VOTE ]

Btw, I found a bot that tries to achieve a similar goal to your initiative (maybe you can get in touch, too):
https://steemit.com/introduceyourself/@duplibot/introducing-duplibot-reducing-rewards-on-comment-spam

(By the way I did find this by using my own content search bot @hounddog ;-) cough.

@andybets | March 27, 2018, 12:41 p.m. | Votes: 0 | [ VOTE ]

I decided that since, unlike email, the senders of messages can't be spoofed in Steem comments, and that account intentions would change slowly if at all, that accounts were a better level of granularity than comments for classification. I do see a lot of merit in an additional layer for scoring individual comments though, and these could in fact feed into an account classifier.

I'm in the steemdevs server in discord, but am not very familiar with it, and also steem.chat as @andybets. I'd be interested in what you're working on. :)

@reggaemuffin | March 27, 2018, 9:30 a.m. | Votes: 9 | [ VOTE ]

This API sounds awesome!

Maybe MB will use this in the coming days to detect abuse ;)

@littlescribe | March 31, 2018, 3:57 p.m. | Votes: 0 | [ VOTE ]

Hey reggae, did you notice @art-universe made a painting of you?

[IMAGE: https://steemitimages.com/DQmNMf4HsdtcgeUK5ybdTzXr38b61fqhSJP81PM82G3cLQE/image.png]

here's the link to the original post if you wanna go check it out.

@reggaemuffin | March 31, 2018, 3:58 p.m. | Votes: 3 | [ VOTE ]

definitely noticed :)

@hiroyamagishi | March 27, 2018, 11:44 a.m. | Votes: 3 | [ VOTE ]

We definitely need to support this amazing project. For this community gets bigger and bigger some users are abusing it. Well done @andybets

@murhadi9 | March 27, 2018, 2:03 p.m. | Votes: 0 | [ VOTE ]

Nice post i like it @andybest

@taufiksagoe | March 27, 2018, 2:13 p.m. | Votes: 0 | [ VOTE ]

nice logo... can use it for my blog?

@beautylove | March 27, 2018, 2:34 p.m. | Votes: 1 | [ VOTE ]

nice post

@tysler | March 27, 2018, 3:04 p.m. | Votes: 1 | [ VOTE ]

Wow nice development for Steemit :) Personally I do ML using randomforest for classification problems. Key issue is to have sufficient features yet not overfit my predictions. Luckily RF does output probability scores for each classification so it makes it easier to set different thresholds.

Depending on how much data you have in your training set, I guess I can take a look at your APIs. Not sure if I could contribute as I am also tired of those pesky spammers while doing nothing about it. Im sure your work can help create whitelists and blacklists or give out a “spam” rating for every user. Ratings should be kept below certain threshold. It could probably help mirror the reputation but focused on catching spam.

Have a good day and hope we can chat a bit more about this implementation.

@andybets | March 27, 2018, 8:14 p.m. | Votes: 0 | [ VOTE ]

Thanks. I'm not familiar with random forest, but I see it relates to nearest neighbours, which my implementation uses. My training set isn't really big enough for it to be highly robust/unbiased, but I hope to fix this soon.

@sarakhani | March 27, 2018, 3:27 p.m. | Votes: 0 | [ VOTE ]

nice good work

@arcange | March 27, 2018, 4:01 p.m. | Votes: 0 | [ VOTE ]

Congratulations @andybets!
Your post was mentioned in the Steemit Hit Parade in the following category:

Pending payout - Ranked 5 with $ 430,67

@srefat | March 27, 2018, 4:38 p.m. | Votes: 0 | [ VOTE ]

We definitely need to support this amazing project. very good post. tnx for this post

@casualmatt | March 27, 2018, 6:07 p.m. | Votes: 0 | [ VOTE ]

One of the best ways to clean up the system would be to prevent people from buying power into the system. Work for your power, don't buy it.

@elevator09 | March 27, 2018, 6:55 p.m. | Votes: 0 | [ VOTE ]

great work to improve steemit! upvoted

@upvotewhale | March 27, 2018, 7:37 p.m. | Votes: 0 | [ VOTE ]

This post received a $2.100 (97.23%) upvote from @upvotewhale thanks to @gex231! For more information, check out my profile!

@baktiarsejahtera | March 27, 2018, 11:34 p.m. | Votes: 0 | [ VOTE ]

I see, read and enjoy your nice post innovative add knowledge, thank you for sharing.

@stoodkev | March 28, 2018, 1:11 a.m. | Votes: 2 | [ VOTE ]

Hi, awesome work! Would you also like to have users input on this?
I am thinking about using this on SteemPlus extension (currently about 1600 active users) and could code something to report spammers / bots to your API if you want to take human feedback into account. You can contact me on Steem.chat/Discord if you're interested.
EDIT: self voting for visibility

@andybets | March 28, 2018, 5:45 a.m. | Votes: 0 | [ VOTE ]

That'd be excellent! I was thinking about the possibility of adding that to my very simple extension, but since yours is much better than I could do, and you have lots of active users, it makes great sense. I'll be in touch. :)

@stoodkev | March 28, 2018, 7:43 a.m. | Votes: 0 | [ VOTE ]

Great! Waiting for your message then.

@samueljmf13 | March 28, 2018, 2:16 a.m. | Votes: 1 | [ VOTE ]

I would like to see more interesting facts like this, do not you?

@tomasbonillo | March 28, 2018, 3:09 a.m. | Votes: 0 | [ VOTE ]

excellent post, I love your work really seems to me of very good content, keep it up, congratulations I always follow your work

@idrisalban | March 28, 2018, 5:04 a.m. | Votes: 1 | [ VOTE ]

very amazing

@idrisalban | March 28, 2018, 5:07 a.m. | Votes: 1 | [ VOTE ]

very amazing

@farizalm | March 28, 2018, 8:46 a.m. | Votes: 2 | [ VOTE ]

Useful postings. thank you @andybets

@fraenk | March 28, 2018, 9:04 a.m. | Votes: 1 | [ VOTE ]

AWESOME! I applaud this so much!

but I am still sad my cuddle-bot (delivering tons of upvotes and barely ever leaving a comment without upvoting) has made it into the top 500 biggest spammers

[IMAGE: https://steemitimages.com/DQmRjKTDw7VRCmMyU7Nz9GvH5LLY8a5TWMvgaz3qbsaRod7/image.png]

I don't think there's many who interact with the kitten who actually see her as spam... but I understand how an algorithm may get to that impression.

P.S.: maybe some of this data could be incorporated into the rating to determine how spammy an account actually is?!

[IMAGE: https://steemitimages.com/DQmR6utX8vW5VgirKG7G73whXL3RA9TyJPpDtRZqaYxrkJN/image.png]

at the moment a very obnoxious spammer (@tomole444 for example) does not make it into the top 500 (despite ~5k comments and ~650 flags received) while my cuddle-delivery service does get caught with "only" 160 comments and zero flags!

@andybets | March 28, 2018, 10:31 a.m. | Votes: 1 | [ VOTE ]

I'd also like to have seen this account with a lower spam score, and higher bot score. I will add it to the bot training set. ;)

These factors you mention are included in the rating, but the accuracy and effectiveness is limited by the training set, which I'm in the process of expanding.

@fraenk | March 28, 2018, 11:07 a.m. | Votes: 1 | [ VOTE ]

I see! Thanks for the feedback... expanding the training data by reasonable but not too biased examples will be the major challenge (as it seems to be with AI).

I'm curious to see how the detection will improve over time. Thanks a ton for the efforts you are making on this!

@alyssas | March 28, 2018, 2:14 p.m. | Votes: 6 | [ VOTE ]

So people will be able to run bots that only look at the list created?

@andybets | March 28, 2018, 2:31 p.m. | Votes: 0 | [ VOTE ]

They can use any of the APIs listed and some allow them to check any account's classification scores. There will be more coming over time.

@cortexx | March 28, 2018, 3:15 p.m. | Votes: 2 | [ VOTE ]

thanks @andybets this will make the steemit community a better place to be . :)

@jestemkioskiem | March 28, 2018, 11:47 p.m. | Votes: 2 | [ VOTE ]

Are there any plans to release the source code under an open source license?

@andybets | March 29, 2018, 5:54 a.m. | Votes: 0 | [ VOTE ]

I am considering this, but haven't decided yet. If so, I would like to recover more of my development costs before it happens.

EDIT: I should also say that there is a cost to this in that spammers would then be better able to circumvent any measures that may arise from their adverse spam scores.

@jestemkioskiem | March 29, 2018, 3:11 p.m. | Votes: 1 | [ VOTE ]

Sounds completely reasonable! I'm a representative from Utopian.io - sounds like you might just be looking for us if you want to go open source and get rewarded in steem.

@andybets | March 29, 2018, 3:39 p.m. | Votes: 0 | [ VOTE ]

Thanks. I've used Utopian for some software; it's great. I'm just unsure about this project.

@jestemkioskiem | March 29, 2018, 5:22 p.m. | Votes: 0 | [ VOTE ]

No problem, I felt kinda weird advertising us there, I just came to comment since I like the project. I just figured you might want to consider this as an option. Let me know if you do decide to go open source!

@viking-ventures | March 29, 2018, 3:44 p.m. | Votes: 1 | [ VOTE ]

This sounds really great! I get tired of some of the known spammers and bots out there.

What will really improve things, is when bloggers can do better than "mute" but can actively block certain people from commenting on their blog. Spammers need prey and the prey being able to better defend themselves would be great.

Until then, your efforts to reduce spam are greatly appreciated!

@andybets | March 29, 2018, 6 p.m. | Votes: 0 | [ VOTE ]

I agree with this. I totally understand why you can't stop people voting or downvoting where the reward pool is concerned, but see no reason that people should need to allow everyone to comment on their posts.

@viking-ventures | March 29, 2018, 9:13 p.m. | Votes: 0 | [ VOTE ]

I've had a couple of unwanted bots commenting on my posts. One was that "catfacts" bot who puts useless trivia about cats in the comments of anyone who uses the word "cat" as a tag. (Mute #1 for me.)
The other is the "cheetah" bot which gets some respect from what I have seen, but in my case, all it could do is provide the link I'd already provided in my article! (Mute #2.) Another person I've ended up muting because the guy puts useless comments everywhere. He can still comment on my posts though, which is annoying. I know I'm "preaching to the choir", but I know you understand where I'm coming from.

@kakra98 | March 31, 2018, 10:27 p.m. | Votes: 1 | [ VOTE ]

Wow I like your style. I'm a beginner your post just amazes me.

@maxruebensal | April 14, 2018, 7:52 a.m. | Votes: 1 | [ VOTE ]

@andybets great idea, but for many people (like me) it could also mean less visibility. For some reason, I was human before but now I am identified as spammer (which is pretty weird as I haven't been active in the last couple of days) and there's really not much you can do about it..

@andybets | April 14, 2018, 8:46 a.m. | Votes: 0 | [ VOTE ]

Sorry for this inaccuracy, it is clear to me you're not a spammer, so I've added your account name to the training data. When the next version is released your scores should improve.

@maxruebensal | April 14, 2018, 8:49 a.m. | Votes: 1 | [ VOTE ]

Thank you so much! I was also wondering how the "personal voting option" for the steemplus extension plays a role in it? Light how much is the voice of a personal voter weighted against the api?

@andybets | April 14, 2018, 11:09 a.m. | Votes: 1 | [ VOTE ]

The data from SteemPlus is used to help form the training data that informs the API what spamming and bot accounts look like, so it can make estimates about the other thousands of accounts that it isn't given a classification for. There are various other data sources as well as SteemPlus though.

@maxruebensal | April 15, 2018, 9:13 a.m. | Votes: 0 | [ VOTE ]

Oh wow, great! Thanks for making that clear!

@kalemandra | May 8, 2018, 4:42 a.m. | Votes: 1 | [ VOTE ]

Hi, i've just found Steem Sincerity in SteemPlus, I've been using it for 2 days now. This is a great tool i think. But i have a question. How can it be calculated? One of my friend is a newbie steemian, @zitus. She made only some posts but she is considered as a 38.14% human, 34.97% spammer and 26.89% bot. And me, as 58.40% human, 40.00% spammer(!!!)and 1.60% bot. Well, i frequently use the same phrases, like "dear Steemies, today is orange, TuesdayOrange" (and other colors for each days of the week)because it's more comfortable for me than formulating different English sentences. That's a hard effort for me because my English is not so good. And recently i made much more posts, 6-7 a day (but they were all good quality posts) Other question: does it count, that i use upvote bots, 3 times daily?

@andybets | May 9, 2018, 3:27 p.m. | Votes: 0 | [ VOTE ]

Hi, these scores are indications or probabilities which application developers can use in their interfaces for excluding or penalising accounts considered to be spammers. Many will not take any action until the spammer score is above 70%, so you don't need to worry about this. New accounts have baseline probabilities, which are 40% human, 30% spammer and 30% bot, and as you interact with the platform they are re-evaluated.

Here you can see your current scores:
http://steemreports.com/sincerity-accounts-info/?accounts=kalemandra%2C+zitus

Only accounts in the 'Spammer' triangle may be penalised by app developers if they're using the Sincerity API.
