Forum: Author Hangout

Punctuation Symbols

Crumbly Writer

In another thread, I wanted to address punctuation and wondered how to do it (other than looking it up in an online dictionary and copying and pasting it). Is there some html coding format (or document type) that allows you to do pronunciation symbols? (I've used them in stories before, but never focused on where they come from.)

I presume it's like most foreign-language accent marks: there ARE no systematic html codes, you simply have to specify the language, which can screw up an eBook's layout when the fonts no longer match.

Keet
Updated:

@Crumbly Writer

As far as I know you have to use the lang attribute:
< span lang="ja">大刀< /span>

... which apparently doesn't work on the SOL forum, but that could be because the span tags are not valid there. As we know, invalid tags are removed by the forum software.
I use it in one of my appendices with descriptions and images of Japanese weapons where it does work as intended with charset UTF8 set in the header.

ETA
A lot of characters do have an html code. Here's a nice table with most of the codes: https://dev.w3.org/html5/html-author/charref.
Problems with windows-1252 vs UTF-8?: http://www.i18nqa.com/debug/utf8-debug.html
For specific symbols like Greek or mathematical: http://www.evotech.net/blog/2007/04/named-html-entities-in-numeric-order/ (old but still valid).
And here's an article how writers can 'work around' the problem: https://writersedit.com/fiction-writing/5-ways-bring-multiple-languages-fantasy-novel/
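For anyone who wants to check a code from those tables without scrolling through them, here is a minimal Python sketch. It only uses the standard library's `html` module; the helper name `entity_forms` is my own invention for illustration, not something from the linked pages.

```python
import html
from html.entities import name2codepoint

def entity_forms(name):
    """Return the named, decimal and hex references plus the character itself.

    `name` is an HTML entity name without the & and ; (for example "eacute").
    """
    cp = name2codepoint[name]          # numeric code point for the entity name
    return f"&{name};", f"&#{cp};", f"&#x{cp:X};", chr(cp)

if __name__ == "__main__":
    for form in entity_forms("eacute"):
        print(form)
    # All three reference styles decode to the same character.
    print(html.unescape("&eacute; &#233; &#xE9;"))
```

The named form only exists for characters in the entity tables, while the numeric forms can address any Unicode code point.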

Switch Blayde

@Crumbly Writer

I found this doing a Google search:

How do I get phonetic symbols on my Mac?

To enable IPA, click the Gear in the top left corner of that window, then "Customize List", then scroll down and check the box for "Phonetic Alphabet". Now, you can use that symbol picker menu to insert IPA by clicking "Phonetic Alphabet" and double-clicking the character you'd like.

Replies:   Crumbly Writer
Crumbly Writer

@Switch Blayde

To enable IPA, click the Gear in the top left corner of that window, then "Customize List", then scroll down and check the box for "Phonetic Alphabet". Now, you can use that symbol picker menu to insert IPA by clicking "Phonetic Alphabet" and double-clicking the character you'd like.

I'm unsure how you're accessing the "gear" in the top left corner, but neither WORD, the Mac, nor ANY of my html programs seem to offer a "Customize List" option. :(

Replies:   Switch Blayde
Switch Blayde

@Crumbly Writer

I'm unsure how you're accessing the "gear" in the top left corner, but neither WORD, the Mac, nor ANY of my html programs seem to offer a "Customize List" option. :(

I didn't understand it. I copied it as it was written. I hoped someone was smarter than me.

Replies:   Crumbly Writer
Crumbly Writer

@Switch Blayde

I didn't understand it. I copied it as it was written. I hoped someone was smarter than me.

No, I don't understand what application/program you're using that has a 'gear' symbol in the upper left window corner. WORD doesn't, nor does Finder or my browser windows.

Ahh, I understand now, you copied the 'how to' directly from the website, so chances are, the advice is SO old that the underlying facility no longer exists on the Mac (or, more likely, the method of access has changed since then).

I'll keep looking. If nothing else, now that I know which character I'm looking for, I can ask in a more knowledgeable writing forum how to find, format and utilize the specific punctuation characters.

Replies:   Switch Blayde
Switch Blayde

@Crumbly Writer

Try what it says in this article by UC San Diego.
https://wstyler.ucsd.edu/posts/ipa_with_osx.html

Article begins with:

As a linguist, you find yourself using the International Phonetic Alphabet (IPA) incredibly frequently. Some of the characters are easy enough to use without any special work (ŋ, ə), as most fonts already include them. However, to get the more cool/obscure characters and diacritics, or to stack diacritics (placing, for instance, a tone marking above a nasal marking), you need special fonts, layouts and setup. In this post, I'm going to explain, as simply as possible, how to go about finding the files and setting this up, all without paying a dime for specialty software.

Replies:   Crumbly Writer
Crumbly Writer

@Switch Blayde

Thanks again, Switch, as 'stacking diacritics' is what I was referencing before, where the punctuation symbol is added as a 'backup and insert' character on the previous ASCII character.

Note: The IPA font does not list the specific symbols that you'll need, so you'll first have to Google the proper punctuation symbols, and then try U+0250 - U+02AF to see which one matches it.
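Rather than trying each code by hand, that range can be listed with its official Unicode names. This is only a rough Python sketch using the standard `unicodedata` module; the helper names `ipa_extensions` and `find_by_name` are made up for this example.

```python
import unicodedata

IPA_START, IPA_END = 0x0250, 0x02AF  # the range quoted in the note above

def ipa_extensions():
    """Yield (code point label, character, Unicode name) for the range."""
    for cp in range(IPA_START, IPA_END + 1):
        ch = chr(cp)
        yield f"U+{cp:04X}", ch, unicodedata.name(ch, "<unnamed>")

def find_by_name(fragment):
    """Return the rows whose Unicode name contains the given fragment."""
    fragment = fragment.upper()
    return [row for row in ipa_extensions() if fragment in row[2]]

if __name__ == "__main__":
    # Search by a word from the symbol's name instead of scanning the list.
    for label, ch, name in find_by_name("schwa"):
        print(label, ch, name)
```

Searching by name fragment is handy once you know what the symbol is called but not where it sits in the range.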

Keet

@Crumbly Writer

There are other advantages gained by using the 'lang' attribute. It is the primary indicator for screen readers to know how to pronounce a word or sentence if the correct language is indicated. A good reason to use it even when you don't need to display the characters correctly.
Another advantage is that you can use it as a CSS selector: lang="es" in your html and in your CSS :lang(es) { font-style:italic; }. A nice way to auto-italicize those words for example. (In most browsers you can even abuse this by using a non-existent language code to enable usage in CSS.)

By-the-way, you should probably use both lang and xml:lang (https://www.w3.org/TR/i18n-html-tech-lang/#indoclang).
Use language code 'zxx' for non-existent languages, i.e. Klingon ;)
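If you generate your html from a script, the markup could be produced along these lines. A minimal Python sketch only: the function name `lang_span` is hypothetical, and it simply writes both attributes as suggested above.

```python
from html import escape

def lang_span(text, lang):
    """Wrap text in a span carrying both lang and xml:lang attributes."""
    code = escape(lang, quote=True)    # keep the attribute value safe
    return f'<span lang="{code}" xml:lang="{code}">{escape(text)}</span>'

if __name__ == "__main__":
    print(lang_span("大刀", "ja"))
    print(lang_span("caliente", "es"))  # would match :lang(es) in the CSS
```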

bk69

@Keet

Use language code 'zxx' for non-existent languages, i.e. Klingon ;)

By 'non-existent' you mean 'unforgivably left out of UTF-8 encoding', right?

Replies:   Keet  Crumbly Writer
Keet

@bk69

By 'non-existent' you mean 'unforgivably left out of UTF-8 encoding', right?

No, actually "no language" would be a better description: https://www.w3.org/International/questions/qa-no-language.

Replies:   bk69
bk69

@Keet

No, actually "no language" would be a

lie?

Remember, Klingon was actually created by a linguistics professor (much like Elvish was). And he engineered a complete language. You can find texts in the language (like Hamlet) and people fluent in it. So the reasons presented for not including Klingon in the specifications originally are no longer valid.

Replies:   Dominions Son  Keet
Dominions Son

@bk69

You can find texts in the language (like Hamlet)

Haven't seen a copy of it, but I have read that there is a Klingon translation of The Bible.

Keet

@bk69

lie?

I was referring to the zxx code, not the Klingon language.

Crumbly Writer

@bk69

By 'non-existent' you mean 'unforgivably left out of UTF-8 encoding', right?

You can often get around those limitations, but if you're only talking about a few sentences, switching to UTF-16 can dramatically increase the size of the story and slow its reading speed (download times or paging time in eBooks), especially for those of us writing full novels or 'Epic' stories.

So, except for a few specific cases, changing the charset declaration is also not an optimal solution.

Replies:   Keet
Keet

@Crumbly Writer

switching to UTF-16 can dramatically increase the size

Since UTF-8 uses 1 byte per plain ASCII character vs. 2 bytes in UTF-16, the file size of mostly-English text will roughly double. You are very right not to use it unless specifically required for some characters.
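A quick way to see the difference for your own text is to encode it both ways and count the bytes. This is a small Python sketch; the helper name `encoded_sizes` is just for illustration, and it uses the `utf-16-le` codec so the byte-order mark is not counted.

```python
def encoded_sizes(text):
    """Return the number of bytes the text needs in UTF-8 and in UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # no byte-order mark
    }

if __name__ == "__main__":
    for sample in ("Dascalu", "ă", "大刀"):
        print(sample, encoded_sizes(sample))
```

Plain ASCII doubles in UTF-16, an accented letter like ă takes 2 bytes either way, and the Japanese characters are the case where UTF-8 is the larger of the two (3 bytes each vs. 2).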

Replies:   Crumbly Writer
Crumbly Writer

@Keet

Since UTF-8 uses 1 byte per plain ASCII character vs. 2 bytes in UTF-16, the file size of mostly-English text will roughly double. You are very right not to use it unless specifically required for some characters.

Especially since many of the Asian languages (ex: Vietnamese, though not Chinese) have a Roman character version for printing in non-16 bit documents.

StarFleet Carl

@Keet

non-existent languages, i.e. Klingon

Klingon exists. It has its own dictionary, even.

Ka' Plah!

Crumbly Writer

@Keet

There are other advantages gained by using the 'lang' attribute. It is the primary indicator for screen readers to know how to pronounce a word or sentence if the correct language is indicated. A good reason to use it even when you don't need to display the characters correctly.

As Switch noted, whatever I use in my printed books, SOL won't support ANY form of the 'language' tag. What's more, as I pointed out myself, changing languages mid-paragraph often produces notable discrepancies in spacing, kerning, size and the shape of characters. So, it's a 'workaround', but it's hardly an optimal workaround. In my recent book, I decided to simply toss out ANY extraneous national accent marks entirely, due to the inability to render them accurately across various outlets.

However, by accessing the specific phonic character set, I was hoping to implement the 'back-space' font characters, which essentially overlay your regular alphabetic font with the necessary pronunciation marks. (I've seen it used in the past, I just can't recall how to access it.)

Replies:   Keet
Keet

@Crumbly Writer

SOL won't support ANY form of the 'language' tag

Ah, I didn't get that SOL doesn't support it. That leaves very little else to do other than using one of the workarounds mentioned in one of the links I suggested. Not 5 minutes ago I suggested the lang attribute option in an email to Reluctant_Sir. I will have to correct that.

Replies:   Crumbly Writer
Crumbly Writer

@Keet

Ah, I didn't get that SOL doesn't support it. That leaves very little else to do other than using one of the workarounds mentioned in one of the links I suggested. Not 5 minutes ago I suggested the lang attribute option in an email to Reluctant_Sir. I will have to correct that.

No, the language specification is handy, especially when displaying 16-bit Asian characters, but it's difficult to implement and SOL still doesn't support it (SOL strips out ALL but the most basic of character attributes), but as I note further down, they DO support the & #4-digit-code; Unicode formatting, but again, ONLY for html and ePubs.
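For what it's worth, those numeric codes don't have to be looked up one at a time. As a rough sketch (Python, standard library only, with a helper name `to_numeric_refs` that I made up), any non-ASCII character can be swapped for its decimal code in one pass:

```python
import html

def to_numeric_refs(text):
    """Replace every non-ASCII character with its &#NNN; decimal reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

if __name__ == "__main__":
    encoded = to_numeric_refs("ă ŋ ə")
    print(encoded)                  # ASCII-only text, safe to paste into html
    print(html.unescape(encoded))   # decodes back to the original characters
```

The result is plain ASCII, so it survives copy and paste between editors, though whether a given site renders the codes is still up to that site.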

palamedes

There are colleges that offer Klingon as a class course. Even though the Klingon language is made up for Movies and TV, learning the language teaches and uses the same fundamentals that are used in learning past dead languages, or at least that is what the college syllabus states.

Ernest Bywater

@Crumbly Writer

Try this one

https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

Crumbly Writer
Updated:

@Ernest Bywater

https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

I've actually used Unicode for situations like this in the past, though I'd forgotten how to access it. However, the included list of Unicode codes only addresses the exact same characters offered in the basic html code (ex: & code;) commands. To load the types of characters that I'm looking for, would once again require mixing and matching different languages, which each have a different look and feel, not only limiting where I can use it (i.e. NOT on SOL), it can also screw up the look and feel of the text.

The standard tables feature mostly European languages, while UTF-16 covers many of the Asian countries, but in my most recent story, with a Roma (gypsy) character, it's simply not covered. (Try searching for Dascalu to see one of the characters I'm looking for, basically an upside down cap).

Note: See the 'updated' response below, once I found the specific character that I was looking for (it was way, way, way down the list, buried amongst all the other characters, so I had to search for the specific symbol codenames).

Crumbly Writer

@Ernest Bywater

Thanks, this helps. I'd used Unicode characters before, but couldn't remember how to access the codes or even what they were called. It took a while to identify the correct character for the Romanian (Roma) character in my newest story, but finding it allows me to include the character in whatever default language I'm using. And by creating it in my html document, I can copy and paste it in my original Word document (though it no longer retains the underlying unicode characters, meaning it's questionable whether it'll appear correctly with the many 'distributors' that I use for my stories; Amazon's generally ok, but SW and others are a bit more iffy).

Now I just have to double back and figure out the pronunciation marks, so readers (and my characters) will have some idea of how the name is pronounced.

Replies:   Keet
Keet

@Crumbly Writer

Now I just have to double back and figure out the pronunciation marks, so readers (and my characters) will have some idea of how the name is pronounced.

I wouldn't bother if I were you, because there are fewer people who know how to interpret pronunciation marks than people who actually understand the word you want to explain. You see them a lot on Wikipedia and unless you studied how it works they make very little sense. They are useful for the html 'speak' function supported by most browsers but you won't see those in readable text. What works better for almost any reader is splitting the word into syllables that represent the sound of the word. I have seen some authors use that, like "calliente (pronounced as ka-li-ente)".

Replies:   Crumbly Writer
Crumbly Writer
Updated:

@Keet

I wouldn't bother if I were you, because there are fewer people who know how to interpret pronunciation marks than people who actually understand the word you want to explain.

The key isn't who understands the symbols, but that curious readers can look up how the name is pronounced. The vast majority of readers will never bother. Personally, I'll have to double back, as I deleted the original story explanation, but the specific character I'm looking for has a 'sounds like' pronunciation to it.

As for putting the pronunciation in brackets (which we all know are unsupported in literature), it's easier to drop it into dialogue as the word/name/phrase is first spoken, as that's a more natural usage and doesn't require a personal 'note' to readers (which breaks the proverbial 4th-wall of fiction).

Update: Turns out, the Romanian a-breve character (apparently each language pronounces it slightly differently) I'm using is pronounced just like the soft-a in "a-bout", so it doesn't actually require an extraneous pronunciation character (and I could actually leave it off entirely, as I was doing before, as it's pronounced exactly as it looks).

Updated Update: Turns out, the Romanian name is pronounced "Das-ka-loo" with a short a-sound but a hard c sound, which doesn't actually sound like anything in standard English ("das-ka-loo my darling?"). Also, according to HowToPronounce.com, the first syllable can be pronounced with either a hard or soft a sound, surrounded by a hard D and C, so "DaS-Ka-loo".

This is clearly a case where too much research does NOT help a story! ;)

Replies:   Keet
Keet

@Crumbly Writer

The key isn't who understands the symbols, but that curious readers can look up how the name is pronounced. The vast majority of readers will never bother. Personally, I'll have to double back, as I deleted the original story explanation, but the specific character I'm looking for has a 'sounds like' pronunciation to it.

If you really want to show words with phonetics, most of the used symbols do have a 4-digit html code:
https://sites.psu.edu/symbolcodes/ipachart/
https://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm#numbers
One of the most extended unicode tables: https://www.codetable.net/unicodecharacters.

Replies:   Crumbly Writer
Crumbly Writer

@Keet

If you really want to show words with phonetics, most of the used symbols do have a 4-digit html code:

Thanks, Keet. It's nice actually SEEING the symbols rather than seeing a list of numeric ranges. But, it turns out the stacking accent marks mostly sit in the upper-700s decimal range (the combining diacritical marks start at 768), which is useful to know just in case you ever need one (ex: you can't find the specific character you want).
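To show what 'stacking' means in practice, here is a small Python sketch (standard library only; the helper name `stack` is mine, purely for illustration). The mark is typed AFTER the base letter, and normalizing can fold the pair into a single precomposed character when one exists.

```python
import unicodedata

COMBINING_BREVE = "\u0306"   # decimal 774, one of the stacking marks

def stack(base, *marks):
    """Append one or more combining marks to a base letter."""
    return base + "".join(marks)

if __name__ == "__main__":
    stacked = stack("a", COMBINING_BREVE)           # two code points
    composed = unicodedata.normalize("NFC", stacked)  # one code point
    print(stacked, len(stacked))
    print(composed, len(composed), f"&#{ord(composed)};")
    print(unicodedata.name(COMBINING_BREVE))
```

Whether the stacked version displays cleanly still depends on the font, so the precomposed character is usually the safer choice when it exists.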

Mushroom

If you want to really make people wonder, throw in the interrobang.

Replies:   Crumbly Writer
Crumbly Writer

@Mushroom

If you want to really make people wonder, throw in the interrobang.

You'd actually be surprised how many recognize and understand the interrobang (it's usually what most teachers use to captivate their 'better' students, since they'll rarely mention it to (and confuse) their hoi polloi students. ;)
