Forum: Bug Report and Feature Requests

Mdash replacement in HTML downloads

Quasirandom

I'm assuming there's a technical need to keep the character conversion to low ASCII characters in HTML/Kindle/ePub file downloads.

That being the case, I can live with curly quotes becoming straight. But is it possible to convert mdashes to two hyphens instead of one? It gets really confusing, especially in sentences with both a hyphenated word and a dash.

ETA: Mentioning HTML was a careless error: I meant the HTML-based formats of ePub and Mobi, not web pages in a browser.
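
As an illustration of the request above, here is a minimal sketch of a low-ASCII fallback that maps an m-dash to two hyphens rather than one. The table and the function name `to_low_ascii` are hypothetical; nothing in this thread shows how the site's own conversion is implemented.

```python
# Hypothetical fallback table: each typographic character maps to a
# low-ASCII stand-in. The m-dash becomes two hyphens so that it stays
# distinguishable from an ordinary hyphen in the output.
ASCII_FALLBACK = str.maketrans({
    "\u2014": "--",  # em dash -> two hyphens
    "\u2013": "-",   # en dash -> single hyphen
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote / apostrophe
})


def to_low_ascii(text: str) -> str:
    """Replace curly quotes and dashes with low-ASCII equivalents."""
    return text.translate(ASCII_FALLBACK)
```

With this mapping, a sentence containing both a hyphenated word and a dash keeps the two visually distinct after conversion.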

Ernest Bywater

@Quasirandom

I don't use an mdash in my stories as I find using a 'space hyphen space' does the same job perfectly and it translates easily into any format you want. Having the space on either side makes it clear you're not hyphenating the words.

Switch Blayde

@Quasirandom

I'm assuming there's a technical need to keep the character conversion to low ASCII characters in HTML/Kindle/ePub file downloads.

I use m-dashes in my stories and novels. They aren't converted to ASCII. They display as an m-dash (—).

ETA: Check out Chapter 1 in my novel "High School Massacre." I have m-dashes in the first paragraph after the section break. You can see it on Bookapy by using the "read sample" (https://bookapy.com/sample/238) or on SOL (https://storiesonline.net/s/23609:234632/chapter-1-high-school-massacre).

Quasirandom

@Switch Blayde

When read through the site using a browser, there is indeed an m-dash there. But when I download an EPUB file of the story using SOL's feature, it's converted to a hyphen.

Ernest Bywater

@Quasirandom

This will depend on the converter software and what code was used for the mdash in the original. Epub is a cut-back type of xhtml code and it does not recognise all the html codes, so some will make it through a conversion to epub while others don't. The last time I checked there were a number of ways of displaying an mdash in html, and I'm not sure which is the xhtml version accepted by the epub converters.
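
For reference, an m-dash can be written in HTML source in at least three ways besides the literal character: the named entity `&mdash;`, the decimal reference `&#8212;`, and the hex reference `&#x2014;`. Numeric references are part of XML itself, so they are generally the safer choice for XHTML-based formats, whereas named entities beyond the five predefined XML ones depend on a DTD declaring them. The sketch below uses only Python's standard `html` module to show that all of these forms decode to the same character; `to_numeric_refs` is a hypothetical helper, not something any converter in this thread is known to use.

```python
import html

# Four spellings of the same character (U+2014) in HTML source.
forms = ["&mdash;", "&#8212;", "&#x2014;", "\u2014"]
decoded = [html.unescape(form) for form in forms]


def to_numeric_refs(text: str) -> str:
    """Rewrite every non-ASCII character as a decimal character reference."""
    return "".join(
        ch if ord(ch) < 128 else f"&#{ord(ch)};"
        for ch in text
    )
```

Running the text through something like `to_numeric_refs` before packaging is one way to sidestep both encoding declarations and named-entity support.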

Quasirandom

@Ernest Bywater

I do ePub conversions all the time, and none of the tools (and none of the readers I've tested with) have ever had a problem with any of the standard HTML entities like quotes and dashes, as long as the character encoding is correctly declared. Even when it isn't, problem characters just fail to render rather than getting replaced with an equivalent lower-ASCII version.

I've no idea how this site's ePub distiller actually works, mind. Developers can do some crazy things.

Ernest Bywater

@Quasirandom

Before I settled on the amount of html code, style code, and Calibre for creating epubs I used a range of software programs to make epubs and a range of readers and viewers to examine the results. In the process I found out that not every epub converter gave the same results; in fact, most gave different results for some reason I could never fathom. The same was true of the epub readers / viewers. I also learned that you could get some real weird results depending on the font you were viewing the epub in. Now this was about a decade ago, so things may be different, but I now have a system in place that works perfectly and gives the exact results I want.

Mind you, I've found it easier not to use the special character coding because many of the problems I saw back then resulted from the systems not liking the special character coding. By avoiding anything that needs a special character code that starts with & and ends with ; I also minimise the amount of work to do when converting from the word processor to the html, epub, and pdf files I end up with.

Lazeez Jiddan (Webmaster)

@Quasirandom

The archive creator neither straightens quotes nor changes the text.

However, text posted before 2016 was that way on the site, and I didn't go through it to 'smarten' the quotes.

Recent posts reflect what the system does currently.

I double-checked an archive for text posted today, and in both the plain text and html versions the quotes were curly and the dashes were n-dashes, as the author posted them.

Quasirandom

@Lazeez Jiddan (Webmaster)

The downloaded TXT file does indeed preserve curly quotes and m-dashes. The downloaded EPUB and Mobi files do not, for me. (Tested on iOS and Windows 10, using downloads via a couple browsers, using a couple recent stories.) If that's not expected behavior, then I think I have a bug-fix request instead of an enhancement request.

Lazeez Jiddan (Webmaster)

@Quasirandom

I checked and yes, you're right, in the EPUB and by extension .mobi files, the quotes are straight and I traced it back to the utility that I use to validate the xhtml code before it's added to the EPUB.

I could find no way to use that utility without it straightening out the quotes.

I can't risk not using the utility or I might end up with a bunch of invalid EPUB files.

I'll see if I can figure out a way to 're-smarten' the quotes after the validation.

Switch Blayde

@Lazeez Jiddan (Webmaster)

the quotes are straight and I traced it back to the utility that I use to validate the xhtml code

What about the m-dash?

Quasirandom

@Lazeez Jiddan (Webmaster)

Thanks for checking.

I've done enough playing around with regular expressions for smartening quotes to know that I, for one, would be really leery of a programmatic way of doing it; there are so many cases that I have to evaluate manually when I'm converting a client's file. ("Is that ' at the start of a word a begin quote or apostrophe?" is just the start of the complications.)

What about the m-dashes?

(Though, how would you distinguish one from a hyphen? Dang...)
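
To make the apostrophe problem concrete, here is a deliberately naive regex smartener. It is only a sketch of the general technique, not the method anyone in this thread uses, and `naive_smarten` is a hypothetical name. It treats any straight quote at the start of the text, or after whitespace or an opening bracket, as an opening quote and everything else as closing.

```python
import re


def naive_smarten(text: str) -> str:
    """Convert straight quotes to curly ones using position-only rules."""
    # A double quote at the start or after whitespace/bracket opens.
    text = re.sub(r'(^|[\s(\[])"', "\\1\u201c", text)
    # Every remaining double quote is treated as closing.
    text = text.replace('"', "\u201d")
    # Same rule for single quotes (also allowed right after an opening
    # double quote, for nested quotations).
    text = re.sub(r"(^|[\s(\[\u201c])'", "\\1\u2018", text)
    # Every remaining single quote becomes a closing quote / apostrophe.
    text = text.replace("'", "\u2019")
    return text
```

Ordinary dialogue and contractions such as "don't" come out right, but a word-initial apostrophe, as in "'Twas", is wrongly turned into an opening quote, which is exactly the kind of case that needs manual review.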

bk69

@Quasirandom

Though, how would you distinguish one from a hyphen?

Check for space. A hyphen will be preceded and followed directly by a visible letter; a dash will not.

Switch Blayde

@bk69

Check for space. A hyphen will be preceded and followed directly by a visible letter; a dash will not.

There are no spaces around an m-dash.

Now sometimes on webpages they put spaces around them, but not in novels (and therefore ebooks).
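
The disagreement above can be sketched in a few lines. The hypothetical helper below applies the spacing heuristic: it restores a dash only where a hyphen has a space on both sides. It is an illustration under that assumption, not a description of how the site handles dashes.

```python
import re

# A hyphen with exactly one space on each side, between non-space characters.
SPACED_HYPHEN = re.compile(r"(?<=\S) - (?=\S)")


def restore_spaced_dashes(text: str) -> str:
    """Turn ' - ' back into a spaced m-dash; leave tight hyphens alone."""
    return SPACED_HYPHEN.sub(" \u2014 ", text)
```

This recovers web-style spaced dashes, but once an unspaced m-dash has been flattened to a hyphen, as in novel-style typesetting, it looks identical to a hyphenated compound and the heuristic cannot tell the two apart.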

Lazeez Jiddan (Webmaster)

@Quasirandom

I've done enough playing around with regular expressions for smartening quotes to know that I, for one, would be really leery of a programmatic way of doing it; there are so many cases that I have to evaluate manually when I'm converting a client's file. ("Is that ' at the start of a word a begin quote or apostrophe?" is just the start of the complications.)

Issue solved. Whatever appears on the site will now appear in EPUB files.

Quasirandom

@Lazeez Jiddan (Webmaster)

Spot check confirms this. Thank you!

awnlee jawking

@Lazeez Jiddan (Webmaster)

the dashes were n-dashes as the author posted them.

I may be missing something, but aren't m-dashes and n-dashes different animals?

AJ

Keet

@awnlee jawking

I may be missing something, but aren't m-dashes and n-dashes different animals?

yep: https://www.punctuationmatters.com/hyphen-dash-n-dash-and-m-dash/

Switch Blayde

@Keet

yep: https://www.punctuationmatters.com/hyphen-dash-n-dash-and-m-dash/

That was a really good article on the dashes. I never knew about the m-dash for missing letters and don't quite understand when you would have it.

But he's wrong about how Word automatically generates the dashes. It must be how his autocorrect is set up.

Ernest Bywater

@Switch Blayde

I never knew about the m-dash for missing letters and don't quite understand when you would have it.

That article is the only place I've seen that usage for an mdash mentioned, so it can't be that common. Every one of the usages listed can be accomplished by using either commas (for extra info) or a hyphen with spaces.

One problem I've seen with the use of the mdash is it doesn't have enough white space around it and some fonts make it look like a hyphen due to the lack of white space, so I avoid using it.

Switch Blayde

@Ernest Bywater

One problem I've seen with the use of the mdash is it doesn't have enough white space around it and some fonts make it look like a hyphen due to the lack of white space,

It's so much longer than the hyphen. It jumps out when I read it.

Ernest Bywater

@Switch Blayde

It's so much longer than the hyphen. It jumps out when I read it.

Not when it fills all the space between the letters on each side of it in the smaller fonts and font sizes. Since I don't know what font the reader will be using I avoid anything that may cause issues with any font.

I guess growing up in a family with adults who had eyesight issues made me so aware of the problems they can have with text that I make a point of ensuring such problems don't occur in my writing.

Switch Blayde

@awnlee jawking

I may be missing something, but aren't m-dashes and n-dashes different animals?

I assumed he found one with an n-dash and it did not convert to a hyphen. Or that it was a typo.
