
Forum: Bug Report and Feature Requests

Story text parsing possible bugs

Quasirandom

I've seen a couple unexpected results in stories I've posted, I think due to the automated text processing. I post as HTML with all quotation marks already converted to curly.

1. If a quotation ends with a close-italic tag (as in "He said what?") the right double quote gets turned into a left double quote. This is clearly wrong and looks like a dumb typo.

2. If a word begins with an apostrophe (right single quote) (as in 'em for them or 'cause for because) the apostrophe is turned into a left single quote. This is wrong, but since most autocorrecting word processors also miscorrect that to a left single quote, it just looks like bad proofreading.

3. The string of space+ellipsis+questionmark gets turned into ellipsis+space+questionmark. This isn't wrong, exactly, but it's unexpected.

There may be more I haven't noticed. But all three of those have happened repeatedly.

Ernest Bywater

@Quasirandom

I post in HTML, and as long as I stay within the approved HTML code the text comes out the same as I input it. There have been only a few exceptions where there was an issue with the Wizard, but they've been rare and easily fixed by reposting the same file.

In the past a few other authors have had issues with HTML conversion in the Wizard, and in each case I know of it's been due to them using code that isn't on the accepted list. In two cases it was because the code MS Word produced when it converted some text wasn't standard basic HTML code.

I use curly double quotes for the dialogue and straight single quotes for the contractions, and they all come out OK.

One thing I sometimes have to do between the conversion to HTML and what I send to SoL is a global change for any cases where the word processor conversion inserted HTML code that uses the ampersand symbol, such as (with blanks inserted for safety) & a m p ; or & q u o t ; or & l t ; and similar. I replace them with the standard text characters I want displayed. I don't know if they cause an issue with the SoL Wizard or not, but I don't take the risk, and I have the exact characters in the text, even when using an accent or the like. That way what appears between the < p > and the < / p > is the exact characters I want to display, and they do.

Quasirandom

@Ernest Bywater

I hand-code my HTML and limit it to the listed acceptable codes. (I would never, ever use Word. I've spent too damn much of my time cleaning up my clients' Word-generated HTML for ebooks. Billable time, but still the ughhhhhh.) I don't use character entities (those & codes) but the characters themselves. And I use curly quotes, both single and double.

I just checked the upload file for my latest story against what's on SOL. In every instance of a < /i > followed by a right double quote, the latter was changed to a left double quote. Every apostrophe at the start of a word was converted to a left single quote.

Something transformed the text between the upload and the page being served, and in suboptimal ways.

Switch Blayde

@Quasirandom

Something transformed the text

The Wizard does transform text.

I submit my docx without curly quotes and the Wizard changes them to curly quotes. I actually like it doing that. That's a good transformation.

It didn't always do that. I did a search with Ma/Fa and sorted by date (ascending) to get the oldest story. It doesn't have curly quotes.

The other stuff sounds like bugs in the transformation process.

Ernest Bywater

@Quasirandom

What you say is very interesting as I use single and double curly quotes with and without italics, yet I've not had a problem with them since Lazeez changed the code to accept the curly quotes. Many years ago it used to have an issue with them, but hasn't for some years now.

I know the SoL Wizard only accepts a limited set of basic HTML codes, and they nearly all relate to formatting instructions.

The character format codes that I know are accepted are: i b em strong sup hr

The paragraph format codes that I know are accepted are: h3 h4 h5 p blockquote img span

br is also accepted, but it's best to flag its use in the moderator notes.

The codes for text alignment and some font colors are also accepted, and are best handled through css code in the file and span commands.
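A quick way to check a file against that list before uploading is to scan it for any tag that isn't on it. The Python sketch below does that with the standard library's `HTMLParser`. The `ALLOWED_TAGS` set is just the list from this post, not an official specification, and the names `TagAudit` and `unexpected_tags` are invented for the example. Wrapper tags such as html, head, body or style are not on the list, so a complete HTML file will report them too.

```python
from html.parser import HTMLParser

# Tags named in this post as accepted. An assumption, not an official list.
ALLOWED_TAGS = {
    "i", "b", "em", "strong", "sup", "hr",
    "h3", "h4", "h5", "p", "blockquote", "img", "span",
    "br",
}


class TagAudit(HTMLParser):
    """Collect every tag name that is not in ALLOWED_TAGS."""

    def __init__(self):
        super().__init__()
        self.unexpected = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag not in ALLOWED_TAGS:
            self.unexpected.add(tag)


def unexpected_tags(source):
    """Return a sorted list of tag names found outside the allowed list."""
    audit = TagAudit()
    audit.feed(source)
    audit.close()
    return sorted(audit.unexpected)


if __name__ == "__main__":
    sample = '<div><p class="c">Title</p><font color="red">x</font></div>'
    print(unexpected_tags(sample))
```

An empty result only means no unlisted tags were found; it says nothing about attributes or how the Wizard treats them.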

Quasirandom

@Ernest Bywater

I've been avoiding spans and css (or style attributes) because they're not listed in the instructions. Good to know they work.

Ernest Bywater

@Quasirandom

spans and using css

I use a very limited css and this is it - with spaces for safety against running as code.

. c { text-align: center }
. red { color: red }
. blue { color: navy }
. green { color: green }

and how I use them in a SoL story file

< p class= "c" >< img src= "https://res.wlpc.com/img/ernestbywater/playball/playball-cover.jpg" alt= "Cover - Baseball field with a baseball in the air" width="580" height="860" >< /p >

< h5 class= "c" >Cover Art< /h5 >

< p >The images are < i >BaseballStadium.jpg< /i > by Bspanberg and
< i >Baseball.png< /i > by Tage Olsin are used with their
permission under Creative Commons Attribution. The
cropping, size adjustment, and text are by Ernest Bywater. All rights
to the cover images are reserved by the copyright owners.< /p >

< p class= "c" >7 April 2021 version< /p >

< p >< span class= "blue" >< i >< b >Note:< /b > UK English is used in
this story, except for dialogue by a US character where US English is
used in the dialogue and some nouns.< /i >< /span >< /p >

< p class= "c" >__________________________________< /p >

< p class= "c" >
< span class= "green" >Chapter 01< /span >< br >
< span class= "blue" >< i >First Game Back< /i >< /span >< br >
< span class= "red" >Epilogue< /span >< /p >

As you can see, it's a very limited css list, but it does make using those items a lot easier. Naturally, what's in the css is limited to what the Wizard already allows as HTML code options. I don't use the span command for the bold or italics as the Wizard will use them as is.

I could also use the left or right paragraph alignments by including them the same way as I do the center, if I wanted to, but I don't.

richardshagrin

@Ernest Bywater

codes that I know are accepted are: i b em strong sup hr

IBM strong supporter.

Michael Loucks

@richardshagrin

Then, when formatting the text into html, the quotes are 're-educated'.

Unfortunately, this often results in the 'wrong' quote being used in my stories. It's especially true when I use single quotes for 'internal' quotes (i.e. a quote in a snippet of dialog).

Could we have an option to not use 'smart' quotes for our story displays? I don't ever use typographer's quotes (and use BBEdit's 'straighten quotes' feature to make sure).

Quasirandom

@Michael Loucks

Or an option to not adjust quotes at all?

Lazeez Jiddan (Webmaster)

@Quasirandom

Due to the variety of submissions that we get, which is as variable as there are authors, we run clean-up scripts.

The clean-up script straightens the text's quotes among other things like fixing obvious punctuation errors.

Then, when formatting the text into html, the quotes are 're-educated'.

When a quote is adjacent to a curly bracket like '{ i} the algorithm for curling the quotes gets confused in some cases, it seems.
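The site's actual scripts aren't shown in this thread, so the following Python is only an illustration of how a straighten-then-curl pipeline can produce the results reported above. It assumes a very simple rule (a quote is a closing quote only if the character right before it is a letter, digit, or sentence punctuation), and the names `naive_curl` and `round_trip` are invented for the sketch. It also ignores quotes inside tag attributes, which a real formatter would have to skip.

```python
# Step 1: straighten. Map curly quotes back to their straight forms.
STRAIGHTEN = str.maketrans({
    "\u201c": '"', "\u201d": '"',
    "\u2018": "'", "\u2019": "'",
})

# Punctuation that can sit right before a closing quote (assumed list).
CLOSERS = ".,!?;:\u2026"


def naive_curl(text):
    """Step 2: re-curl, looking only at the single preceding character."""
    out = []
    prev = ""
    for ch in text:
        if ch in "\"'":
            closing = prev.isalnum() or (prev != "" and prev in CLOSERS)
            if ch == '"':
                out.append("\u201d" if closing else "\u201c")
            else:
                out.append("\u2019" if closing else "\u2018")
        else:
            out.append(ch)
        prev = ch
    return "".join(out)


def round_trip(text):
    """Straighten, then re-curl, as in the process described above."""
    return naive_curl(text.translate(STRAIGHTEN))


if __name__ == "__main__":
    # The character before the final quote is '>', so it is curled as opening.
    print(round_trip("\u201cHe said <i>what?</i>\u201d"))
    # The character before the apostrophe is a space, so it is curled as opening.
    print(round_trip("Tell \u2019em"))
```

Under this rule the tag's closing bracket hides the question mark from the quote, and the information that the author had already curled the apostrophe correctly is lost in the straightening step.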

Quasirandom

@Lazeez Jiddan (Webmaster)

You straighten curly quotes, then re-curl them? That seems ... dicey. (That said, I can only imagine the variety you're seeing submitted.)

I don't know of a way to programmatically distinguish between an opening single quote and an apostrophe at the start of a word, unless it's already curled correctly. (I've tried and failed to find one.) The curling near the end-italics tag looks like a clear bug, though. If there are no spaces on either side of the end-italics tag, I can't think of a use case where left-curl is correct.
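For what it's worth, here is a rough Python sketch of two heuristics that address the cases in this thread. It is not the site's code and not a general solution. First, tags are copied through untouched and ignored when deciding which way a quote curls, so a quote after an end-italics tag is judged by the text before the tag. Second, a leading single quote is treated as an apostrophe when the word after it is on a short list of known elisions. The `ELISIONS` list and the function names are assumptions for illustration; the list can never be complete, and it will guess wrong when a quoted phrase happens to start with one of those words.

```python
import re

# Words commonly written with a leading apostrophe (assumed, incomplete).
ELISIONS = {"em", "cause", "til", "tis", "twas"}

# Punctuation that can sit right before a closing quote (assumed list).
CLOSERS = ".,!?;:\u2026"

TAG = re.compile(r"<[^>]*>")
WORD = re.compile(r"[A-Za-z]+")


def is_closing(prev):
    """True if the last visible character suggests a closing quote."""
    return prev.isalnum() or (prev != "" and prev in CLOSERS)


def starts_elision(text, start):
    """True if the word beginning at `start` is a known elision."""
    match = WORD.match(text, start)
    return match is not None and match.group(0).lower() in ELISIONS


def curl(text):
    """Curl straight quotes, treating HTML tags as invisible."""
    out = []
    prev = ""  # last character seen outside a tag
    i = 0
    while i < len(text):
        tag = TAG.match(text, i)
        if tag:
            out.append(tag.group(0))  # copy the tag; prev is left unchanged
            i = tag.end()
            continue
        ch = text[i]
        if ch == '"':
            out.append("\u201d" if is_closing(prev) else "\u201c")
        elif ch == "'":
            if is_closing(prev) or starts_elision(text, i + 1):
                out.append("\u2019")
            else:
                out.append("\u2018")
        else:
            out.append(ch)
        prev = ch
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(curl('"He said <i>what?</i>"'))
    print(curl("Tell 'em it's 'fine'"))
```

Because tags are copied whole, quotes inside attributes such as class= "c" are left straight.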
