System names generation - Italian variant

WKFO
Posts: 200
Joined: Wed Jun 14, 2017 12:37 pm
Location: Turkey

Re: System names generation - Italian variant

Post by WKFO »

Could help with Turkish if you are interested.
testadilegno
Posts: 26
Joined: Sun Feb 02, 2014 8:54 pm
Location: Milano, Italy

Post by testadilegno »

Great :) so the main steps are:
  • Get a list of cities/towns in your language. For example, I got the Italian and French ones from the interior ministry websites of the two governments. The list should be reasonably large because we need some statistics; at least about a thousand names would be great.
  • Clean up the list. This is the tricky part: some names contain a geographical qualifier (valley of xxx) or another name (saint xxx in yyy). Have a look at the names and try to figure out patterns for removing those parts (we don't want to do it by hand). You can write the rules down in a file, say: whenever this word is at the beginning of a name, drop it; or skip everything after this word; and so on. I can then adapt a script to apply them to the full dataset, or if you want I can share my scripts so you can play with them.
  • Split into syllables. This is a pain to do by hand, and hard to program even with a grammar handbook on your desk. I found some online resources with syllabication engines for a few languages; you should look around to see if one exists for your chosen language.
  • Then we take the split names and make three sets: one with all the first syllables, one with the last ones, and one with the middle ones. Then we pick the top X most common ones from each set and it's done. This again is easy, and I already have all the scripts to do it.
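
To make the clean-up and counting steps concrete, here is a minimal Python sketch. It is not one of the scripts mentioned above: the rule words, the function names and the sample input are made up for illustration, and the syllable splitting is assumed to have been done already by an external tool.

```python
from collections import Counter

# Hypothetical clean-up rules (illustrative only, not the real rule file).
DROP_LEADING = {"san", "santa", "val", "di"}  # drop these when they start a name
CUT_AFTER = {"in", "sul"}                     # skip this word and everything after it


def clean_name(name):
    """Apply the two kinds of rule described above to a single name."""
    words = name.lower().split()
    # Rule 1: strip unwanted leading words, but never empty the name.
    while len(words) > 1 and words[0] in DROP_LEADING:
        words = words[1:]
    # Rule 2: cut the name at the first "stop" word that is not the first word.
    for i, word in enumerate(words):
        if i > 0 and word in CUT_AFTER:
            words = words[:i]
            break
    return " ".join(words)


def syllable_sets(split_names, top_x):
    """Tally first, middle and last syllables and keep the top_x of each."""
    first, middle, last = Counter(), Counter(), Counter()
    for syllables in split_names:
        if len(syllables) < 2:
            continue  # a one-syllable name has no separate first and last
        first[syllables[0]] += 1
        last[syllables[-1]] += 1
        middle.update(syllables[1:-1])
    return tuple([s for s, _ in c.most_common(top_x)] for c in (first, middle, last))
```

For example, `clean_name("San Giovanni in Fiore")` gives `"giovanni"`, and `syllable_sets` returns three lists (first, middle, last) that a name generator could draw from.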
Let me know if there are other questions and how you'd like to proceed.
Thanks :)
WKFO
Posts: 200
Joined: Wed Jun 14, 2017 12:37 pm
Location: Turkey

Post by WKFO »

I assume we don't want special characters such as Ğ, Ş, İ, ı, Ç etc., because they would make searching for systems by name hell for non-natives. If so, I will replace them with G, Sh, I, i, Ch... or whatever is closest.
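
A replacement like this could be sketched in Python as below. Only the Ğ, Ş, İ, ı and Ç substitutions come from the post; the lowercase entries, Ö/Ü and the names `TURKISH_TO_ASCII` and `to_ascii` are assumptions added for illustration.

```python
# Illustrative substitution table. The two dotted/dotless i letters are
# written as escapes because they are easy to confuse visually.
TURKISH_TO_ASCII = str.maketrans({
    "Ğ": "G", "ğ": "g",
    "Ş": "Sh", "ş": "sh",
    "Ç": "Ch", "ç": "ch",
    "\u0130": "I",  # İ, capital I with dot
    "\u0131": "i",  # ı, dotless lowercase i
    "Ö": "O", "ö": "o",  # assumption: not mentioned in the post
    "Ü": "U", "ü": "u",  # assumption: not mentioned in the post
})


def to_ascii(name):
    """Return the name with Turkish letters replaced by ASCII stand-ins."""
    return name.translate(TURKISH_TO_ASCII)
```

Applied to the city list before the syllable statistics are computed, this would keep every generated name typeable on a US keyboard, at the cost of losing the original spelling.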
testadilegno
Posts: 26
Joined: Sun Feb 02, 2014 8:54 pm
Location: Milano, Italy

Post by testadilegno »

Honestly, I kept the French accents and the cédille (ç). If you work with UTF-8 files there should be no problem, so do as you see fit (font rendering code aside). Maybe in 1200 years they will reform orthography... or not :)
impaktor
Posts: 994
Joined: Fri Dec 20, 2013 9:54 am
Location: Tellus

Post by impaktor »

The question is whether our font supports those characters. I guess you'll notice.
WKFO
Posts: 200
Joined: Wed Jun 14, 2017 12:37 pm
Location: Turkey

Post by WKFO »

My main concern is that even if our font supports it, at some point someone with a standard US English keyboard will try to make a trade run to a system that starts with Ş for example, and they won't be able to search for it.

Maybe a drop-down list of special characters under the search bar? Could we use this if this happens to be a rare case? It would take up nearly no space when closed.
nozmajner
Posts: 1079
Joined: Mon Jul 01, 2013 3:25 pm
Location: Budapest HU

Post by nozmajner »

How complex would it be to make the search function treat certain kinds of letters as equivalent? For example, it would treat Ğ as G. Hungarian names would benefit from that as well (áéíóöőűú are pretty common).
WKFO
Posts: 200
Joined: Wed Jun 14, 2017 12:37 pm
Location: Turkey

Post by WKFO »

nozmajner's idea sounds more plausible.
impaktor
Posts: 994
Joined: Fri Dec 20, 2013 9:54 am
Location: Tellus

Post by impaktor »

This depends on what is being discussed. I've assumed this would be a mod, e.g. for Italians, or <x> with a proper keyboard layout.

Anything for inclusion in master would have to be typeable on a standard UK/US keyboard, and pronounceable.
testadilegno
Posts: 26
Joined: Sun Feb 02, 2014 8:54 pm
Location: Milano, Italy

Post by testadilegno »

I understand @impaktor's stance. However, the naming system requires an overhaul. If the dev team feels that the little things we're working on move in the desired direction, it might be worth putting in a little effort to get this into master.

I have little experience with the C++ side, but maybe the solution discussed here:

https://www.codeproject.com/tips/131667 ... ode-string

would be helpful.

If I understand it correctly, it would convert UTF-8 encoded accented letters to their plain English-alphabet equivalents. This transliteration could then be used to compare the system name with the string typed in on a standard en-US keyboard. The modifications to the search function look minimal this way.
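
To show the idea, here is a rough Python sketch of such a comparison. It is not the code from the linked article and not the game's C++ search function; `fold`, `matches` and the `SPECIAL` table are hypothetical names. It assumes Unicode decomposition (NFKD) is good enough for most accented letters, with a small hand-written table for letters that do not decompose, such as the Turkish dotless ı.

```python
import unicodedata

# Letters without a Unicode decomposition need an explicit entry.
# This table is an assumption and would have to be extended per language.
SPECIAL = str.maketrans({"\u0131": "i"})  # ı, Turkish dotless i


def fold(text):
    """Reduce text to plain lowercase letters, for comparison only."""
    text = text.translate(SPECIAL)
    # NFKD splits e.g. "ő" into "o" plus a combining double acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches(query, system_name):
    """True if the typed query occurs in the name, ignoring accents and case."""
    return fold(query) in fold(system_name)
```

The stored system name keeps its original spelling; only the two strings being compared are folded, so a player typing "gyor" would still find "Győr".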

Let me know what you think. Regards :)