At the beginning of this week, a journalist from the Wall Street Journal asked me for my opinion on bot-generated articles and on the activity of Lsj and his bot Lsjbot in the Swedish Wikipedia, as an addition to the answers she got from Lennart Guldbranson, a well-known Wikipedian from Sweden. The following is what I answered:
You asked me to answer some questions concerning bot-generated content, since I may have a more critical opinion on Lsjbot and bot-generated content than Lennart. To state one thing first: I am not active in the Swedish Wikipedia, so my view of the topic is more general than focused on sv.wikipedia.org. To introduce myself: I am mostly active in the German Wikipedia, where you can find my profile and work at https://de.wikipedia.org/wiki/
I do not know Lsj (the user who started creating bot-generated content via his bot Lsjbot) personally, and the only work of his that I know is the bot-generated content in several Wikipedias. I am sure he is doing his best to bring Wikipedia forward and that, for him, starting this project is a step towards making Wikipedia better. So from my point of view I am not opposed to him as a person, even though I am criticizing him on this particular topic.
For readers, an article like e.g. Yungasia_tricolor (a random article) does not really help someone who is searching for information. This article only conveys more or less correct taxonomic notes on a species name. It does not help if you want to know something about the species: what it looks like, where it lives and how it lives. So if someone really searches for this particular species of leafhopper (which can only be expected of experts in entomology), he will not find any useful information about it; the information given is no better than no information. I would expect that an author would at least tell me where I can find it (Brazil, as you can read in the easy-to-find first description by Zanol, 1991). The main argument, that this article is better than none, does not count for me.
For quality reasons, I prefer to have fewer good articles rather than millions of non-articles. To compare: Rüppellfuchs was the work of weeks for me and may be an extreme case, but even articles like Rot-Weißes_Riesengleithörnchen (2 hours of work at most) show what the goal should be. Compare them with the Swedish one, Petaurista_alborufus (and please compare the number of subspecies, which is 0 in sv and 6 in my article).
For browsing users who find this article at random it is the same: the article is boring to read, and 99% of other random articles are the same (try slumpartikel).
For authors:
It is often claimed that if there is a short article (stub), new users will come and expand it. My experience tells another story: it is unattractive to expand an existing stub, since most users try to find niches where they can start articles from scratch. Articles that do not exist yet are the best way to persuade authors to bring in their knowledge by starting a new article, and even if the result is of the same quality as bot content, it is worth more, since it is the start of a potential new author. The best way to discourage potential authors is to present them with a field of thousands of pseudo-articles, all with the same structure, where their own work will get lost among the rest and will not be found by others. I think: yes, you can have a 1.5-million-article Wikipedia if you use bots, but this will lead to the decline or even death of user activity in the areas you try to fill this way. If the German WP were populated with such stubs, I don't think I would be interested in contributing my knowledge to it.
As a compromise: I think it would be a good idea to use Lsjbot and other bots to fill in datasets in the Wikidata project and to offer those data to authors, for example when they start an article, so that they can choose whether they want to use them. This could increase the quality of Wikidata as a database without flooding the Wikipedias and could be a valuable addition. The only fields where I could imagine boticles are areas where nothing more exists than database entries (e.g. galaxies), or some geographical additions, but even there I see more problems than positive effects.
This has become a bit longer than I expected. Sorry about my not-so-good English, but I think one can understand my points. If there are any open questions, please don't hesitate to contact me.
Best regards,
Achim