User talk:DaxServer
empty description
Otherwise empty descriptions seem to create a few problems [1][2][3][4]. Maybe this is sufficient to track them. I fixed all it found. Enhancing999 (talk) 11:54, 7 August 2024 (UTC)
- Thanks for the heads-up. I updated the regex pattern. Let me know if you find other errors. I'll keep an eye on that parsing error category ;) -- DaxServer (talk) 13:11, 7 August 2024 (UTC)
- It seems to go quite smoothly, but there seem to be ever more than I first thought. I made a category for potential problems: [5]. Enhancing999 (talk) 14:12, 7 August 2024 (UTC)
Stray text
Just noticed that some files have the template text all over: [6]. I can try to find and fix them after the bot run. Enhancing999 (talk) 10:48, 8 August 2024 (UTC)
- Thanks! The text is messing up the regex. Hopefully they're manageably low in numbers for manual corrections! -- DaxServer (talk) 10:51, 8 August 2024 (UTC)
- They do seem rare: I found them looking for "please" in the source [7] (a few other image titles and other notes remain). Enhancing999 (talk) 10:54, 8 August 2024 (UTC)
- The notices seem mostly gone by now: thanks for that! Creator template could still be added to ca. 8700 files. Do you plan to do that? Enhancing999 (talk) 11:00, 10 August 2024 (UTC)
- Yup, I'll do that after the weekend ;) -- DaxServer (talk) 11:22, 10 August 2024 (UTC)
- @Enhancing999 I started it now -- DaxServer (talk) 10:16, 16 August 2024 (UTC)
BotRequest
Re. -"I, "
Hi Dax, thank you for the message that you want to do the cleanup of these typos, which seem to be leftovers from image transfers from Wikipedias to Commons around 2007/2008 ... I think we can safely ignore a lot of the discussion that really belongs on the template discussion page, where all the confusion and mess comes from in the first place. This case is just about fixing a bunch of typos, not about new legalistic approaches or statements. But since it is in the official license chapter, I won't touch this on such a massive scale (23k items). If you agree, I would be happy to see these typos fixed. Thanks Peli (talk) 22:57, 12 August 2024 (UTC)
- Hi @Pelikana I do wish there were a consensus, but it seems we have to wait a bit longer than anticipated! -- DaxServer (talk) 08:40, 14 August 2024 (UTC)
- Hello, yes I see. And now I also have some doubts about touching these PD licenses at all (where the authors in later days probably would have used cc-by-sa). It's a small typo that I can live with, and fixing it would be just cosmetic after all. Peli (talk) 08:48, 14 August 2024 (UTC)
File:Marlboro Advance Perforated Holes Filter (India).png
- This file is a copyright violation because it comes from: https://www.reddit.com/r/todayilearned/comments/evmn66/til_that_light_cigarettes_are_designed_to_fool
The said reddit post has the URL of the Wiki page on which this image was placed; reddit crawled the wiki to read that image, not vice versa.
I request that you please handle deletions with diligence, so as to avoid deleting valid images and wasting editors' time restoring them. Thank you, User4edits (talk) 05:19, 1 September 2024 (UTC)
Congratulations! It has bot status now. EugeneZelenko (talk) 14:27, 2 September 2024 (UTC)
- Thank you! -- DaxServer (talk) 14:54, 2 September 2024 (UTC)
Kumar Gandharva Shankara has been listed at Commons:Categories for discussion so that the community can discuss ways in which it should be changed. We would appreciate it if you could go to voice your opinion about this at its entry. If you created this category, please note that the fact that it has been proposed for discussion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with it. If the category is up for deletion because it has been superseded, consider the notion that although the category may be deleted, your hard work (which we all greatly appreciate) lives on in the new category. In all cases, please do not take the category discussion personally. It is never intended as such. Thank you!
∞∞ Enhancing999 (talk) 19:42, 2 September 2024 (UTC)
Question
Hello Dax! Below I have a question/request, but I would first like to know if it is even feasible:
Brazil's Superior Electoral Court has an enormous CC-4.0 electoral database of candidate portraits for the voting machine, dating from 2004 to the present. @Pfcab: has done an enormous job of uploading a large chunk (as of right now, there are 14 736 files in Category:Files from Portal de Dados Abertos do TSE, but there's still a lot more to go). The JPEG database has images of candidates for Presidency, Governor, Senators, Deputies (State and Federal level), Mayor, Council people, Vice-Presidency, Vice-Governor and Vice-Mayor. This deletion request was closed as keep, with those images considered in scope for Commons.
I wonder: would it be possible for a/your bot to upload these files while naming each one like File:2020 LUIZ MARINHO CANDIDATO PREFEITO SP SAO BERNARDO DO CAMPO TSE (250000897682).jpg (YEAR - CANDIDATE - STATE - CITY - TSE - IMAGE ID)? @Sailoratlantis: did a long exposition about this database here.
If the bot can't work with the .zip files, the site divulgacandcontas.tse.jus.br has the biographical material (see here for example). Considering that the images are from the same database, could the bot recognize links like this one (23.7 KB JPG) and the name of the candidate, title the file as in the example above, and categorize it in the subcategories of Category:TSE electoral portraits by year and maybe in the subcategories of Category:Politicians of Brazil by party?
As a side request: there are several images uploaded with several styles of naming (as you may see here). Could the bot fix and standardize everything, following the example above (created by Pfcab)? It would also be good to look for uploaded files from the same database that are missing the Template:TSE-Dados-Abertos and the category by year.
I don't know more about the technical side, but I hope that this was all useful. I didn't make a work request only because it all seemed more complex than Commons seems to allow. Thank you very much,
Erick Soares3 (talk) 23:24, 3 September 2024 (UTC)
- On the Council people, deputies and senators, to avoid a bloated upload, I'm in favor of only uploading the elected people (the .CSV files from the Portal have all the data). Erick Soares3 (talk) 23:44, 3 September 2024 (UTC)
- Hi @Erick Soares3 Thanks for asking me. Let me look into that and I'll get back to you with my opinions -- DaxServer (talk) 09:09, 4 September 2024 (UTC)
- @Erick Soares3 Here are my opinions, assuming the DR is closed with the decision that the files are properly licensed.
- The naming format is very much possible and easy, as the info exists in the CSV files in the ZIP. It is also possible to categorize under TSE portraits by year and also under the by-party categories. I don't speak Portuguese, so it is not immediately clear which columns in which CSV refer to the political affiliation, but I assume the information is somewhere in there. If all the information provided in the divulgacandcontas.tse.jus.br portal exists in the dataset ZIPs, then it is much easier, as we don't need to collect the information from that portal's API but can just download a bunch of ZIP files and work on them. If not, some research into their API would be required to understand what is what.
- For the existing files, standardising the naming format, templating with the TSE template, and categorizing are also possible once the information is collated as above; some sort of reconciling will surely need to be done one way or the other. I think it's better to do the renaming and updating after organizing the info, so as to avoid any double work. Looking at that category, there seem to be colorizations of the originals, like this one. I guess that is one of the questions that needs an answer, but such questions will come up once the work is started.
- The dataset has a ton of biographical information about the candidates. Most of that belongs in Wikidata. So, I see this as a cross-wiki project of very good value, where the uploads go here and the bios go in Wikidata and are linked in the SDC. I'm not sure if there is an existing bot that is already working on this data on Commons and/or Wikidata, but if not, it shouldn't be much of a hassle once someone takes up the work.
- All the images of candidates for all the offices stated above are in scope for Commons. I'd upload all, and not just the elected ones. I'd recommend posting this request at Commons:Batch uploading so that others can chime in as well. Do you know how much of the dataset @Pfcab is working on uploading? If they plan to do all of it, they might already have finished before I or someone else starts working on your request. I am interested as well, although I can only work as time permits and if help is provided with Portuguese. I'd also recommend posting at Wikidata, maybe the Project Chat, about the project and asking for help/opinions/comments. If you need any sort of help from me, feel free to ask.
- Good luck! -- DaxServer (talk) 10:52, 4 September 2024 (UTC)
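For illustration only, the naming step discussed above could be sketched roughly as follows. This is a minimal sketch, not the bot's actual code: the column names (`year`, `name`, `office`, `state`, `city`, `image_id`), the `;` delimiter and the sample row are all assumptions made up for the example, and the real TSE CSV headers are in Portuguese and would need to be mapped first. The fixed word "CANDIDATO" is taken from the example title quoted in the request.

```python
import csv
import io
import unicodedata

# Hypothetical sample data: the column names and delimiter are assumptions,
# not the real headers of the TSE CSV files.
SAMPLE_CSV = """year;name;office;state;city;image_id
2020;Luiz Marinho;Prefeito;SP;São Bernardo do Campo;250000897682
"""


def strip_accents(text):
    """Remove diacritics so that 'São' becomes 'Sao'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_title(row):
    """Build a file title in the style of the example given in the request."""
    parts = [
        row["year"],
        row["name"],
        "CANDIDATO",
        row["office"],
        row["state"],
        row["city"],
        "TSE",
    ]
    stem = " ".join(strip_accents(part).upper() for part in parts)
    stem = " ".join(stem.split())  # collapse any repeated whitespace
    return f"File:{stem} ({row['image_id']}).jpg"


def titles_from_csv(text, delimiter=";"):
    """Read CSV text and return one proposed title per row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [build_title(row) for row in reader]


if __name__ == "__main__":
    for title in titles_from_csv(SAMPLE_CSV):
        print(title)
```

Keeping the title logic in a single function such as `build_title` would let the same code serve both new uploads and the renaming of existing files, once the real column mapping is known.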