Forum: Author Hangout

Readership Demographics

Geek of Ages

Have any statistics been gathered (or more to the point, published) regarding the demographics of this site's readership?

I can take some guesses on authors based on what they say in their blogs, stories, and forum posts—but I'm not sure how well that translates into readership.

Ross at Play 🚫

I asked the webmaster once what proportion of readers was from North America and Canada. I just included the name Lazeez in my post knowing copies of those are extracted for him to read.
He replied that about 70% were, based on hits at the site.
If you ask here and he has them, I'm sure he'll give us the answers.
I don't recall what information is asked for during the registration process, but I expect he could produce an analysis of those figures fairly easily.

Replies: Dominions Son robberhands

Dominions Son 🚫

@Ross at Play

He can make a reasonable guess at where every member is located based on the primary IP they access the site from.

For paid memberships, if you know how to read the credit card numbers, the number will tell you what bank issued it and from which country.

Even if he never personally sees the CC numbers, the bank routing information on the payments when they show up in the site's accounts will likely provide the same information.

robberhands 🚫

@Ross at Play

I don't recall what information is asked for during the registration process, but I expect he could produce an analysis of those figures fairly easily.

Since it's his business I think it's reasonable to assume he already has done that. Whether he wants to publish this information is an altogether different question.

Replies: Ross at Play awnlee jawking

Ross at Play 🚫

@robberhands

Whether he wants to publish this information is an altogether different question.

Yes. I've never known him to object to being asked for something, or to have any problem just stating, "I won't do that," when he has his reasons.

awnlee jawking 🚫

@robberhands

It's Lazeez's site, but I like to think the authors, editors, reviewers and readers share in the investment of keeping it moving in a forward direction.

If the site didn't progress in the direction he intended, eg it became a dumping ground for lowest common denominator porn stories when he wanted an image of universal genre coverage (except children's books, obviously), I wonder whether he'd seek our help and advice.

AJ

Replies: robberhands

robberhands 🚫

@awnlee jawking

It's Lazeez's site, but I like to think the authors, editors, reviewers and readers share in the investment of keeping it moving in a forward direction.

I don't object to that statement, but published statistics about SoL's customers would not only be useful to sympathetic people.

Geek of Ages

Things like Google's analytics are very good at providing a detailed demographic breakdown (into e.g. age, sex, race, class, etc.) on viewers. But I don't know if any of those analytics are being run.

So, let's do the incantation:

Lazeez, Lazeez, on the Net
Do you have an answer to our request?

(Okay, so the rhyme is a little off)

Namely, do you have more detailed demographics on the readership of the site than just country, and are you willing to share those data; or is that a no-go?

Replies: Ross at Play

Ross at Play 🚫

@Geek of Ages

Lazeez, Lazeez, on the Net
Do you have an answer to our request?

(Okay, so the rhyme is a little off)

Lazeez, Lazeez, our esteemed webmaster
We have a question, do you have an answer?

Vincent Berg 🚫

In general, based on Lazeez's previous comments, the basic assumption is that Lazeez tries not to collect user information, I'm assuming due to member worries that some government may request who's sharing what stories. As such, he generally says he doesn't have any detailed breakdowns. While examining IP addresses (via Google Analytics) is helpful, remember that many either disable it entirely or use 'shell' IPs which disguise their actual country of origin, so the information derived is dubious at best.

Bottom line, the clear majority are North American (more U.S. than Canadian) with a general smattering from all the primary (predominantly) English-speaking countries. That explains why so many stories focus on American characters in American situations (e.g. at American football games vs. European football games).

Lazeez Jiddan (Webmaster)

I don't run any analytics packages on the site. The only stats are those that I collect directly using the site's engine.

As it's stated very clearly in the site's privacy policy, I don't gather any data on the site's readers or authors. I don't try to locate IP addresses so I don't really know where people are coming from.

External services like 'Alexa' try to gather that data using tools installed on users' systems, but they're highly inaccurate.

The only certain information that I have access to is the addresses of premier members because they have to enter their addresses while subscribing (a credit card companies' requirement), but I never really analyzed them. Even if I did, the data would be very highly skewed to the countries that can afford to spend that kind of money, and the numbers are so minuscule compared to the free members that they're not really representative of the whole membership.

I don't know, maybe I'm not a good business person for not collecting that data. I just never bothered. After all, I don't run ad campaigns, SOL simply relies on word of mouth and some search engine hits. When you don't run ads, you don't need the data.

Here is the link to Alexa: https://www.alexa.com/siteinfo/storiesonline.net (it says on top that the site's metrics are estimates).

Replies: Geek of Ages red61544

Geek of Ages

@Lazeez Jiddan (Webmaster)

Cool, sounds reasonable. I'm just a curious person 😎

red61544 🚫

@Lazeez Jiddan (Webmaster)

I don't know, maybe I'm not a good business person for not collecting that data. I just never bothered.

That may be true, but you're the perfect person to be running this site. There are more than enough sites tracking every move a person makes; at least there's one that doesn't. Thank you!

Vincent Berg 🚫

It's not directly applicable, but I have run metrics on the people who've visited my site, which I used to list on both SOL and ASSTR. It's hardly a measure of ASSTR, but for a general breakdown of readership, it might be generic enough.

My scores showed readers in virtually every country on the globe, including Iran, Iraq, Saudi Arabia and China (the very ones you'd think would ban access to American 'sex' sites), back in the days when I was still posting (some sex) incest stories about harems.

However, the overriding numbers skewed about 90% American (U.S.A.). Also, those numbers wouldn't reflect anyone restricting Google Analytics (which I routinely do on my own browsers), nor anyone disguising their IP addresses (though the addresses in restricted countries tend to argue against that being a real consideration).

Make of that what you will.

I still have the analytics installed, but I haven't run any reports in ages, so I'm not sure if those figures have changed much over the years.

Ernest Bywater 🚫

Even running metrics like Google Analytics (deliberate spelling) can be misleading when people disable GA on their own system by blocking it. Also, anyone using an anonymizer or VPN like TunnelBear will throw the counts out.

Joe Long 🚫

Alexa highlights South Africa but not Australia?

robberhands 🚫

How do they gather data for demographics about gender or even education level?

Replies: Geek of Ages

Geek of Ages

@robberhands

Keep in mind that Google et al. also try to track people on the individual level to some extent, and build up a portfolio identity based on places you've gone and what you've clicked on. Turns out there's a not-insignificant correlation between sites you frequent and your demographic.

And that's not even getting into text analysis...

Replies: robberhands awnlee jawking

robberhands 🚫

@Geek of Ages

Big Google is watching you - that's scary.

awnlee jawking 🚫

@Geek of Ages

So do Facebook and Twitter, even for people who don't have accounts with them. Information is power. :(

AJ

Replies: JohnBobMead

JohnBobMead 🚫

@awnlee jawking

Facebook is not as all knowing as they'd like to be. Last week they let me know that Elizabeth Mead and I became friends on Facebook three years ago. She's my sister, we've been friends all my life. Which, from what I've observed, cannot be said of all siblings.

Google isn't as observant as they think, either. Given the percentage of my searches dealing with academic research, you'd think they might adjust their search algorithms for me to actually recognize W. E. Collins as a search for W. E. Collins, not we Collins. Some searches can't be done on Google (ok, not just google) because of their assumptions about what an allowable search term is, and their assumptions as to what has to be a mistyping. In academic publishing, until extremely recently, you _never_ used your full name. It was always your surname and initials. So searching W. E. Collins on Google gave me lots of Phil Collins song lyrics. On the Internet Archive, it gave me W. E. Collins, so it _can_ be done.

Replies: Switch Blayde

Switch Blayde 🚫

@JohnBobMead

So searching W. E. Collins on Google gave me lots of Phil Collins song lyrics.

What if you put "W. E. Collins" in quotes, like it's written in this sentence?

Replies: JohnBobMead Ernest Bywater Vincent Berg

JohnBobMead 🚫

@Switch Blayde

Didn't make a difference. Based upon my experience, they don't really pay that much attention to the quote marks. "W. E. Collins" still gave me pages of Phil Collins song lyrics before anything else, their spelling correction assumptions appear to have priority. Some spellings they tell you they are searching on a correction of what you typed, but give you the option of searching on what you actually typed, they don't do that with a whole bunch of others, the change is made before it gets to the search system, they appear to have a pre-search system typo correction filter; for most searches this probably works in their favor, but there are fields of knowledge that are practically impossible to research because of it. All the major search engines seem to do this. Given the number of typos I make, I can't blame them, but they don't have a selection in their advanced search menu that allows for a character by character match with what you type bypassing their typo correction filters. Due to this, some things cannot be found out using the major search engines. A search on S. Ing. will always become a search on sing, instead of returning Saint Ignatius, for whom S. Ing. was the abbreviation used for over five hundred years of scholarly publication. I know that for certain, because I tried that search; I had to resort to searching the Internet Archive, which _does_ treat S. Ing. as S. Ing.

Google and the other major "free" search engines are good for casual searching and general knowledge, but the very things that make them good at that make them useless for in depth research of a great many fields, you have to access specialized search engines, which are generally only available to someone associated with a Research University or Major Corporation; they all charge big access fees to support themselves, while Google, etc., generate advertising revenue and revenue from their specialized services; people grouse about the information mining Google does, but that's what makes it possible for them to provide a "free" service; we're paying for it by helping them gather information to sell to businesses to help them target potential customers. The specialized search engines, with their subject specific databases, don't have secondary markets, so they have to make their money from those using them as a search engine; some of them will allow individuals to purchase access to their search engines and thus their databases, but most limit it to organizations; if nothing else, few individuals can afford what they have to charge, so for most it would not be cost effective to even provide the option of individual, non-institution accounts, it would cost them more than they'd bring in. It's not as bad as it could be, as the larger Public libraries can get access to some of them, and if you have a library account with that library system you can access them via the library web site, but if you live in a less well funded library system, access is not possible. My sister can access the Online OED via her library system, I can't via mine. We live less than a hundred miles apart. 
When I worked at the Chicago Public Library, they had a fee based online search service, where they would consult with the client, formulate a search using the search term thesauri for the various databases, and then go online with some of the fee-based databases to see what they could find; the cost of each search was such that they had to charge the end user for the search, but this made it possible for individuals and firms who couldn't afford the monthly subscription fees to gain access for tightly focused searches; CPL is _huge_ and could offer this service, 99% of Public Libraries can't, they don't have the staff or budget.

Access to general knowledge can be subsidized by the secondary revenue streams. Where there are no secondary revenue streams, the end user has to pay for it. Simple economics. For most of the population this isn't a problem, as they don't desire access to that specialized knowledge. I happen to want access to it, and am not associated with academia in a formal manner, so I'm SOL. I'm an economically non-viable market, and so long as I live in a free market society, that's how it will be. I don't know that it's any better in a Socialist country such as Sweden, I haven't investigated what is made available via their public libraries, if they receive enough government financial support to be able to provide access to those databases. If I became a premier member of Academia.edu I'd get access to the Online OED, that's one of their marketing points, but I'd need to know what other databases it would provide me access to before I looked into how much it would cost me each month.

But this is _very_ off topic by now. Hope I didn't bore anyone too much.

EDIT: OH My Fucking GOd, that's a whole BLOG Posting!
EDIT 2: Encyclopedia Britannica, not the Online OED, is what Academia.edu provides access to with premier membership.

Replies: Joe Long Vincent Berg JimWar

Joe Long 🚫

@JohnBobMead

But this is _very_ off topic by now. Hope I didn't bore anyone too much.

I just pretended it was a sex scene and skipped over it.

Vincent Berg 🚫

@JohnBobMead

Didn't make a difference. Based upon my experience, they don't really pay that much attention to the quote marks.

At least with Google, if they make an assumption like that, the very first line of search results you'll see a line like: "Did you mean ..." If you click "No", or click on your original search term, it will use what you asked for.

That happens to me fairly often when searching for unusual story topics.

JimWar 🚫

@JohnBobMead

Didn't make a difference. Based upon my experience, they don't really pay that much attention to the quote marks. "W. E. Collins" still gave me pages of Phil Collins song lyrics before anything else, their spelling correction assumptions appear to have priority.

I just took the "W. E. Collins" from above and put it in a google search and didn't get a single Phil Collins. In fact the search was spot on for what you would expect to get.

Ernest Bywater 🚫

@Switch Blayde

What if you put "W. E. Collins" in quotes, like it's written in this sentence?

A search like that will usually return any exact matches on W.E.Collins first, then follow them with any matches on Collins with high hit rates. Any on W E Collins will probably be near the end of the general hits.

Replies: JohnBobMead

JohnBobMead 🚫

@Ernest Bywater

I'd have _sworn_ I'd tried that and it didn't work; in fact, that huge info dump I did was from that perspective. But, I just tried it, and you are right. While a couple of "we"'s slipped in, it _did_ pull up a number of W E Collins, but unless he switched from the history of early Frisia to English Ecclesiastical History, none concerning _my_ W. E. Collins in the first 200 entries, at which point there appeared the dreaded Google message saying they really weren't going to provide anything more. So I tried "S. Ing." After 28 pages, and having to confirm I wasn't a robot, they _still_ hadn't come up with Saint Ignatius. Lots of clear misspellings of s ing, but not S. Ign. as an abbreviation for Saint Ignatius.

Edit: Just checked the Advanced Search options. You can limit by how old the information is on the Web, i.e. how long ago it was entered. You can have it search a range of numbers, and add something to define what the number refers to, such as lb. So I tried S Ing exact match, with 1500 AD - 1800 AD. Yes, the results had numbers between 1500 and 1800, and occurrences of AD. Any AD, not just AD as short for Anno Domini. The AD didn't need to be one space away from the number, just in the same paragraph. ads counted as a hit for AD. Nine pages in and only a couple of hits for pre-1800, no Saint Ignatius abbreviation. So while I was wrong about _why_ Google, etc., isn't any good for certain types of research, I was _right_ that they aren't. Bummer. I would have been _ecstatic_ to have been proven completely wrong, as it would have meant Google would meet my needs.

Vincent Berg 🚫

@Switch Blayde

What if you put "W. E. Collins" in quotes, like it's written in this sentence?

That's always how I do it, especially when searching for books. My surname shows up in relation to a LOT of different books, so the only way I can find mine is to enclose the title in quotes, and my name in a separate set of quotes.

Geek of Ages

Obviously, he should have written it to be easily skipped over, just like you can skip over having a tomato on your hamburger.

:troll:

Ernest Bywater 🚫

John,

most search engines use what's known as Boolean search parameters. They have a way of their own that usually takes a little work to master, but once mastered is very helpful.

The general rule is to look for any match to what is being asked about.

However, if you place two or more words between double apostrophes (what some call double quotes) the system will search for an exact match on what's inside the apostrophes. When you use capital letters some search engines will look for an exact match on the capitals, while some won't. By an exact match I mean if you have "W.E.Collins" with full stops after the letters it will not see "W E Collins" without the full stops as a match.

It will not associate St. or S with Saint, either.

You can use multiple instances of exact string matches so you can search on "W.E.Collins" "Saint Ignatius" to have it look for both exact strings for a match of both together. - only half a dozen hits on that.

The other most useful option is to use the minus sign. A search on Lincoln brings up results about the president, the car, and movies with the cars first and presidential related as the most common on the first page of 467,000,000 results. However, if you enter Lincoln -president there are 451,000,000 results. Make it Lincoln -president -car -welder and there are 382,000,000 results with the top one being about windows. I sometimes use this method to winnow a search by just adding more items with a minus sign so they cut out the top end results I'm not interested in.

Take the name Bywater - it results in 2,340,000 hits with the top ones being about a New Orleans suburb. Make it Bywater -"New Orleans" and it drops to 4,230 results. I often do searches on names and include minus parameters relating to the social media sites, thus they have -twitter -facebook -linkedin to reduce the results a lot.

Replies: Joe Long JohnBobMead

Joe Long 🚫

@Ernest Bywater

The other most useful option is to use the minus sign.

Thanks. I did not know about this.

Replies: Ernest Bywater Geek of Ages

Ernest Bywater 🚫

@Joe Long

Thanks. I did not know about this.

The one thing to remember with it is to have it right beside what you don't want without any space between them. So for a single word, like los you have -los but for multiple words you use the double apostrophes to make it a string -"los angeles". In these examples the first would exclude Los Angeles, Lost and anything with the first three letters of los while the second would only exclude Los Angeles.
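
As a minimal sketch of the exclusion syntax described above, here is one way to assemble such a query string in Python. The helper name `build_query` is made up for illustration and is not part of any search engine's API; it only builds the text you would type into the search box.

```python
def build_query(terms, exclude=()):
    """Join search terms and append each exclusion with a leading minus sign."""

    def fmt(term):
        # Multi-word phrases are wrapped in double quotes so they stay together.
        return f'"{term}"' if " " in term else term

    parts = [fmt(t) for t in terms]
    # The minus sign sits directly against the excluded term, with no space.
    parts += ["-" + fmt(t) for t in exclude]
    return " ".join(parts)


print(build_query(["Lincoln"], exclude=["president", "car", "welder"]))
print(build_query(["Bywater"], exclude=["New Orleans"]))
```

The first call prints `Lincoln -president -car -welder` and the second prints `Bywater -"New Orleans"`, matching the forms used in the posts above.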

Geek of Ages

@Joe Long

They have quite a few useful things:

https://support.google.com/websearch/answer/2466433

JohnBobMead 🚫

@Ernest Bywater

Ernest,

"W. E. Collins" and "S. Ign." were totally unrelated searches that I'd experienced recently, so conjoining them would not have helped me. I just tried "S. Ing." and it _did_ drop the punctuation. When I searched using a number range delimitation of 1500 AD to 1800 AD, 1-800 was highlighted as a match; my experience is that Google strips punctuation, and ignores it in its results; "s-ing" and "using" (with the "sing" highlighted) were highlighted results of "S. Ing." S. Ing. was the standard abbreviation in the Catholic Church and academic circles for at least three hundred years, but you won't find it in the online abbreviation sites that Google retrieves, and you won't find it in Google's search results. The full item I was searching for was from a holding library label, "Bibl. Prag. S. J. et S. Ing.", the poster had wanted to know what that meant. Searching the whole phrase pulled up one hit in the one search engine I found anything on, and that was the book the OP had seen. S. J. is the Society of Jesus, the Jesuits; _that_ search Google returned it as the first hit. With it being a library ID label, Bibl. is Biblioteca, Library, in many variant tongues; it also expands as Bible, Bibliography, Bibliographic. Prag. is most likely the city of Prague, it was a standard abbreviation, and the other possible expansions of Prag. that I could confirm were ridiculous in context. Documenting defunct libraries is tough; while there is an extant library in Prague that was founded by the Jesuits, a rather famous library, as far as I could tell it never used that designation; that doesn't mean it didn't, because I couldn't find anything solid about what it used to be called, and it changed management when the Society of Jesus was disbanded; the current Society of Jesus is the second creation of that order, there's a period where it didn't exist.
The modern library associations don't concern themselves with libraries that no longer exist, at least not on their web sites; they focus on library education and advocacy, improving the viability of modern libraries. I was trying to find documentation outside of my memory for the meaning of the various components of the library ID. Since Google, in my experience with this search, is ignoring punctuation even if it's inside the quotation marks, excluding the false positives becomes difficult; limiting by a term works if you have a lot of false hits for a term, and few positive hits with the term to be excluded, and I used -Colorado with "W. E. Collins" from the start because of a very famous Collins Drive in Colorado (Famous? I'd never heard of it, but 75% of the first page, or so it seemed, referred to it), that's where I hoped the number range delimitation in the Advanced Search form would help by allowing me to limit the results to the time period where the abbreviation was universally in use, but since Google doesn't treat the items in the range field as unitary, and considered 1,800 1-800 and 18.00 the same as 1800, and ads as a match for AD, they were all highlighted in the results, it didn't return only those items with a date in the text between 1500 AD and 1800 AD. Now, what really gets my goat, is some of the sources I was able to verify as having S. Ing. as an abbreviation for St. Ignatius, and for the other search the W. E. Collins I was looking for, are items held by Google Books. Yes, by searching Google Books specifically, they would probably have turned up. The Internet Archive, being a much more focused data set, pulled them up right away. I'm not really surprised Google didn't turn up my W. E. Collins, he's pretty obscure, but S. Ing. for Saint Ignatius isn't something I think of as obscure, not in my fields of interest. 
Except, Google doesn't have access to the proprietary databases which store the electronic versions of the scholarly journals, nor the proprietary databases which store the converted primary source materials for Renaissance and Early Modern history and culture, just the .pdfs they've created at Google Books, and other open depositories such as the Internet Archive. The items of my search aren't things talked about much on web sites Google can index. They show up in the text of historic documents, and modern conversations use the modern abbreviations unless directly quoting a primary source. It's just frustrating that if I'm researching a computer related term it shows up instantly, if I'm searching for anything relating to modern society it doesn't take that much to narrow down, given the right exclusions and inclusions, but historic information is so much harder to find when using non-unique search terms, since the "free" search engines focus on bringing up the most recent information first, and where possible those items closest to you geographically. Google works for general information of the modern age. For legal information you search LexisNexis, for other things ProQuest, Cengage, and a variety of others which provide access to proprietary databases, and they all charge quite hefty fees for access, such that in general you have to be associated with either a University or a major firm in the field, or have access to the more general of them through a major library.

My experience is that Google does _not_ do a character by character match of what you put inside the " ", they ignore the punctuation, and they will ignore the spacing. I just searched "S. Ing.", the first hit is "Urban Dictionary: s'ing d's"; they ignored the spacing, the capitalization, the punctuation; nothing in that entry was an exact match of my search string. They treat " " as a proximity constraint, the items have to be very close together, in that order, but since they ignore punctuation and spacing not exact matches. They do _not_ do a true Boolean search, since they _don't_ do a character by character match, including case and punctuation and spacing. Doesn't really matter, in the end, I don't think, other than .pdfs of historic books and journals, that they have access to the databases having information related to these two searches. W. E. Collins only seems to be mentioned in publications of the Cambridge Press for 1890-1893; he had a book in preparation with them that didn't get published. A couple of other mentions in 1890, since his dissertation had won a prestigious award, the revision of it was what Cambridge Press was planning to publish as part of their Cambridge Historical Essays series; a search on "Cambridge Historical Essays" on Google does bring up most of the series as .pdfs, including the ones mentioning W. E. Collins; it also brings up "Cambridge : Historical Essays" as the first hit, again ignoring punctuation. I've sent an email to the Archivist at Selwyn College, Cambridge, where he got his degree, to see if they had any further info on him, such as his full name, but haven't heard anything back, nor do I really expect to. S. Ing., at the Internet Archive, pulled up a lot of documents; the Internet Archive _does_ seem to pay attention to punctuation and spacing inside of " ", as well as being composed of a higher percentage of relevant documents. 
But a lot of those documents are also in Google Books, and Google should have pulled them up with a true " " treatment.
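
To make the difference concrete, here is a toy Python model of the two behaviours discussed above. It is only a sketch of the hypothesis in this post (punctuation and case are ignored, and quotes act as an ordering and proximity constraint); it is not Google's actual algorithm, and the function names are invented for illustration.

```python
import re


def normalize(text):
    """Lowercase the text and treat every run of non-alphanumerics as one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()


def loose_phrase_match(query, document):
    """True if the normalized query tokens appear consecutively in the document."""
    q, d = normalize(query), normalize(document)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


def strict_phrase_match(query, document):
    """True only on a character-by-character match, including case and punctuation."""
    return query in document


label = "Bibl. Prag. S. J. et S. Ing."
false_hit = "Urban Dictionary: s'ing d's"

print(loose_phrase_match("S. Ing.", label), strict_phrase_match("S. Ing.", label))
print(loose_phrase_match("S. Ing.", false_hit), strict_phrase_match("S. Ing.", false_hit))
```

Under the loose model both texts match, because "S. Ing." and "s'ing" reduce to the same two tokens; under the strict model only the library label matches.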

Replies: Vincent Berg Switch Blayde Ross at Play

Vincent Berg 🚫

@JohnBobMead

Uh-oh, it looks like I've been supplanted as the run-on sentence, non-breaking paragraph king!

Replies: Ernest Bywater

Ernest Bywater 🚫

@Vincent Berg

Uh-oh, it looks like I've been supplanted as the run-on sentence, non-breaking paragraph king!

nah, forum posts are a different crown to story paragraphs.

Switch Blayde 🚫

@JohnBobMead

my experience is that Google strips punctuation,

I'm not sure of this, but I think Google drops short words, like words with 3 or 4 characters. But I'm not sure.

Replies: Vincent Berg

Vincent Berg 🚫

@Switch Blayde

I'm not sure of this, but I think Google drops short words, like words with 3 or 4 characters. But I'm not sure.

They not only do that, prioritizing keywords, but worst of all, they treat the entire entry as a string of "OR" conditions, so you'll get as many hits for "the" and "author" as you do for "Collins" (meaning, when you use complete sentences, you'll get back mostly nonsensical results).

I remember searching for one of my books "Singularity", and getting back links to "The Tao of Pooh"!

Ross at Play 🚫

@JohnBobMead

YOU CAN SEARCH FOR ANYTHING.

You can replace any special symbol with a percent sign and the number of the symbol you want. You must have seen %20 within URLs numerous times. That's because 20 is the number allocated to the space symbol.

I don't know which character set is used, but once you have the right one you'll be able to search for anything.

SORRY, THAT'S NOT WORKING

google.com converts special characters it finds in the dialog that way.
When I tested a search including %20, it converted that and sent a request including %2520 (because 25 is the number for a % sign).
Sigh!
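
The conversion described above can be reproduced with Python's standard `urllib.parse` module. This is a small sketch, not a description of what google.com runs internally; the helper `hex_code` is invented for illustration. Note that the number after the percent sign is hexadecimal, so 20 is decimal 32 (space) and 25 is decimal 37 (the percent sign itself).

```python
from urllib.parse import quote, unquote


def hex_code(ch):
    """Return the two-digit hexadecimal code that follows the percent sign."""
    return format(ord(ch), "02X")


print(hex_code(" "))  # code for a space
print(hex_code("%"))  # code for the percent sign

once = quote("W. E. Collins")  # spaces become %20
twice = quote(once)            # the % of each %20 is itself encoded as %25
print(once)
print(twice)

# Decoding twice recovers the original text.
print(unquote(unquote(twice)))
```

Encoding text that already contains `%20` is what produces `%2520`, which is the behaviour seen when typing `%20` into the search box.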

THIS IS HOW I FOUND "W. E. COLLINS"

I put that in the dialog box, and the results were useless.
I then edited the request that had just been sent. I replaced four '+' signs with '%20' and then hit return.
The section I amended looked like this "W.%20E.%20COLLINS"&oq="W.%20E.%20COLLINS".
I was using google.com.au, and the first thing it returned was details from the Australian War Memorial about a "Private W E Collins" who was awarded a military medal during the first world war.
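
The hand edit above swaps one way of encoding a space for another. As a hedged illustration in Python, `urllib.parse.quote_plus` produces the form-style encoding with `+` for spaces, while `urllib.parse.quote` produces `%20`. Whether a given search engine treats the two differently is an observation from this thread, not something this sketch can confirm.

```python
from urllib.parse import quote, quote_plus

phrase = '"W. E. COLLINS"'

form_style = quote_plus(phrase)  # spaces encoded as '+'
path_style = quote(phrase)       # spaces encoded as '%20'

# Mirror the manual edit: replace every '+' with '%20'.
hand_edited = form_style.replace("+", "%20")

print(form_style)
print(path_style)
print(hand_edited == path_style)
```

Both functions also encode the double quotes as `%22`, which browsers usually display as plain quote marks in the address bar. The phrase contains two spaces, and it appears twice in the request quoted above, which would account for the four `+` signs replaced.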

Replies: Geek of Ages Vincent Berg JohnBobMead

Geek of Ages

@Ross at Play

ASCII: http://www.asciitable.com/mobile/

Vincent Berg 🚫

@Ross at Play

I don't know which character set is used, but once you have the right one you'll be able to search for anything.

I use it ALL the time to embed the book readers were reading when they send me email, so I know what they're referring to (on my website, not on SOL which doesn't allow html mailback commands).

JohnBobMead 🚫

@Ross at Play

Tried it. Took me a bit to realize you started with a normal "W. E. Collins" search, but once I figured that out, it worked for finding W. E. Collins. Not _my_ W. E. Collins, but indeed it zeroed right in on W. E. Collins.

Doing the same thing with "S.%20Ing."&oq="S.%20Ing." replacing the search command generated by "S. Ing." had the problem of too many false positives, again.

The first screen is almost all "s'ing" matches, or other results that say Google considers punctuation to be a blank space in results as well as searches. One false positive was "S***ing", as in expurgated Shitting as profanity in news reports (it was about a Sports figure), so it doesn't matter how many punctuation symbols, they get treated as one blank space. "-s,-ing" was a hit. Also clear that the full stop isn't required for a hit. Might try adding the ASCII code for full stop after the letter but before the space code to see if that would make it require a full stop.

Did pull up Todd S Ing and Andrew S Ing in the first five pages, which was an improvement, as was no sing or using as matches.

So to get rid of matches with punctuation in the positives, I'd have to do searches excluding those matches then modify the output with the ASCII character string for the punctuation character in the exclusion part of the search string? This only has a chance of working if they actually examine the non-alphanumeric characters, instead of assigning them all the value of blank space, and the search results indicate they treat all of them as if they were a blank space. Rather, they treat them as a null string, white noise, they don't give them a value at all. At least that's what it seems like to me.

Um, just checked. S. Ing., no quotes, the space gets ditched, as do the full stops. It treats it as a search on "sing", even though it shows S. Ing. as the search term. Placing " " around the search terms prevents them from being slammed together as the default actual search, but non-alphanumeric characters count as blank spaces. Modifying the search by replacing the + in the search results with %20 came up with results _identical_ to the search without the %20 character code; non-alphanumeric characters are classified as a blank space by Google's search engine. "W E Collins" and "W. E. Collins" produce identical search results.
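The behaviour described here can be modelled in a few lines of Python. This is only a sketch of the observed results, not Google's actual algorithm: the `normalize` function is a made-up helper that lower-cases the text and turns every run of non-alphanumeric characters into a single space.

```python
import re

def normalize(query: str) -> str:
    """Lower-case the text and collapse each run of non-alphanumerics to one space."""
    return re.sub(r"[^0-9a-z]+", " ", query.lower()).strip()

samples = ['"W. E. Collins"', "W E Collins", "S. Ing.", "s'ing", "S***ing", "-s,-ing"]
for text in samples:
    print(f"{text!r:20} -> {normalize(text)!r}")
```

Under that model "W. E. Collins" and "W E Collins" become the same string, and "S. Ing.", "s'ing", "S***ing" and "-s,-ing" all collapse to "s ing", which matches the false positives listed above.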

Just spotted something. The W. E. Collins search still has we Collins matches in the first two pages of results, where it was clear from context that it _was_ the word "we" no space that triggered the hit.

I'm not sure what to think about Ernest's argument about Google only accessing web site provided metadata. I have a blog. I've never entered tags. I've never entered anything resembling metadata, to my knowledge, unless the title of the blog post would be metadata. I've been contacted by the web store I purchased a product from that I reviewed on my site; contact was via the comments section of the blog. I didn't name the tool in the post title, nor did I name the company in the post title. I just tried to pull that particular post up via a web search, and failed. I _did_ link to their site for the picture of the product. However, _another_ post where I mentioned their firm _did_ turn up in a search on their name and the name of my blog. "Garrett Wade" "not so random thoughts" The title of the post is "Dandelions", so that wouldn't be the metadata indexing. Searching "lee valley" and "not so random thoughts" pulled up _two_ posts mentioning them; both searches also reported the main page of my blog. For both searches there were other posts that mentioned them that were not returned as hits. No mention of Lee Valley in either post's title. One post was returned by both searches; I'd mentioned a gardening tool they both sold. No links inside either post. So, somehow, without any metadata that I'm aware of, Google found two posts that I've made on my blog. Searching "Grandpa's weeder" "not so random thoughts" also pulled up that post; that's the tool. The other post where Garrett Wade is mentioned, searching on the tool as I labelled it along with the blog name didn't pull it up. So not everything got indexed. "clematis" "not so random thoughts" pulls up two posts, both have clematis in the post's title. There were more posts concerning clematis that didn't get pulled up; it had taken over the back yard, but I eradicated it in one season; I posted about it quite regularly for a bit. Pretty flowers, but most invasive, and it climbs over _everything_.
As I said, I didn't knowingly create any metadata, I _know_ I've never used tags with any of my posts. Somehow Google has partially indexed my blog by things not mentioned in post titles. Blogger _is_ owned by Google, it's always possible they're creating metadata for the blogs on Blogger.

Edit: It's _what_ time of day? I haven't gone to bed yet, it _can't_ be 5:00 AM!

Replies: Ross at Play Vincent Berg Ernest Bywater

Ross at Play 🚫

@JohnBobMead

Doing the same thing with "S.%20Ing."&oq="S.%20Ing." replacing the search command generated by "S. Ing." had the problem of too many false positives, again.

So, if you were capable of writing a short version :), it would be, "Getting there ... but, geez!" Right?

You might need to start using something like:
"S.%20Ing.-%33-%35"&oq="S.%20Ing.-%33-%35"
... assuming 33 and 35 are the right numbers for an apostrophe and a star.
What a pain!
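As a check on those numbers: the two digits after a '%' are the character's ASCII code written in hexadecimal, not decimal. A small Python sketch (the `percent_code` helper is made up for this example) prints the codes for the characters in question:

```python
from urllib.parse import unquote

def percent_code(ch: str) -> str:
    """Return the %HH escape for one ASCII character; HH is hexadecimal."""
    return f"%{ord(ch):02X}"

apostrophe = percent_code("'")
star = percent_code("*")
space = percent_code(" ")
print(apostrophe, star, space)  # %27 %2A %20

# Decoding %33 and %35 gives the digits 3 and 5, not punctuation.
decoded = (unquote("%33"), unquote("%35"))
print(decoded)  # ('3', '5')
```

So the apostrophe would be %27 and the star %2A. Whether Google honours those codes in an exclusion is a separate question that this sketch does not test.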

Replies: Vincent Berg

Vincent Berg 🚫

@Ross at Play

You might need to start using something like:
"S.%20Ing.-%33-%35"&oq="S.%20Ing.-%33-%35"
... assuming 33 and 35 are the right numbers for an apostrophe and a star.
What a pain!

It's the same thing we face when we try to convey how to enter html commands on this forum. Instead of simply saying "…" we instead need to type (bear with me, because it takes some time to get this to come out right) "&hellip;". It's also why we type html commands as "< br>", because if we left off the space, it would try to execute the command instead of using it as text.
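The usual way around this in software is to escape the special characters so they display as text. A minimal Python illustration using the standard html module follows; it shows the general technique only and says nothing about how this forum's own software handles posts:

```python
import html

# Escaping turns markup characters into entities, so the tag shows as text.
escaped_tag = html.escape("<br>")
print(escaped_tag)  # &lt;br&gt;

# To show an entity name literally, its ampersand is escaped as well.
escaped_entity = html.escape("&hellip;")
print(escaped_entity)  # &amp;hellip;

# Unescaping converts the entity back to the character it stands for.
ellipsis_char = html.unescape("&hellip;")
print(ellipsis_char)  # …
```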

Vincent Berg 🚫

@JohnBobMead

So to get rid of matches with punctuation in the positives, I'd have to do searches excluding those matches then modify the output with the ASCII character string for the punctuation character in the exclusion part of the search string?

I've never had to go that far, as it seems that sticking everything within double quotes preserves the punctuation as well, though I may be mistaken, as I've never formally tested that assumption. Thus I'll typically search for '"The Demons Within" +"V. Berg"' (though, I've always used my full name when publishing as "Berg" is too common of a name to be reliable for searching).

I'm not sure what to think about Ernest's argument about Google only accessing web site provided metadata. I have a blog. I've never entered tags.

He's right. You can install add-ons which provide metadata and SEO data for your blog/website, which allows you to fine tune what gets reported, but frankly, what they recommend changes so frequently it's not worth the effort of staying ahead of it. It's primarily a way for blogs to game the system and grab a higher 'readership' count than the actual number of people following their stories. Thus they take advantage of 'tricks' to fool readers into visiting their sites by accident, and once Google and Amazon catch on, they'll move on to the next trick to cheat the systems.

Ernest Bywater 🚫

@JohnBobMead

I'm not sure what to think about Ernest's argument about Google only accessing web site provided metadata.

Some of the metadata is automatically created and included in the page header, such as the page title and author. There is a wide range of other metadata items that can be included or left unset. The bots that scan the websites and pages check the metadata, the page title (usually from the metadata), the url the page is on, and any other urls on the page. Some will scan some of the content, but not all do that, and there is no consistent manner in how much of the content they scan or why.

Replies: JohnBobMead Switch Blayde

JohnBobMead 🚫

@Ernest Bywater

Some will scan some of the content, but not all do that, and there is no consistent manner in how much of the content they scan or why.

So without deliberately entering metadata, whether anyone will ever find the information in a given post is unlikely. So I need to learn how to attach metadata to my blog posts, if I think I'm saying anything someone else might find of interest. And learn to be concise in summarization for said metadata, and learn what the accepted terminology is. Hey! If I succeed, maybe I'll be able to post on Twitter!

Replies: Vincent Berg Ernest Bywater

Vincent Berg 🚫

@JohnBobMead

So without deliberately entering metadata, whether anyone will ever find the information in a given post is unlikely. So I need to learn how to attach metadata to my blog posts, if I think I'm saying anything someone else might find of interest. And learn to be concise in summarization for said metadata, and learn what the accepted terminology is. Hey! If I succeed, maybe I'll be able to post on Twitter!

Again, if you include an SEO tool on your blog (there is a whole variety of decent free ones, plus a bunch of incredibly expensive ones which don't do anything more for you), you can view what your current metadata is. There are even FREE metadata sites which will report what each site contains, so you'll know if you should make changes.

However, the SEO recommendations are continually changing, and it's NOT really worth it to become an SEO master unless you're actually earning money from it. If you are, then it's easier to hire someone to handle all the SEO information for you at once.

For my own webpage, since I have two, a FREE site that allows SOL and ASSTR readers to read my story site AND a professional site where I promote my books for sale, I DON'T want most of the material to be freely available via search engines (another reason for my using the pseudonym "Crumbly Writer"). Thus it really doesn't hurt me not to bother properly coding each chapter's content. Readers will either find it on their own, or it just isn't in my interest to tell them about it.

Ernest Bywater 🚫

@JohnBobMead

So I need to learn how to attach metadata to my blog posts

The Search Engine Optimisation (SEO) options available are more extensive than what I'm about to cover. However, the basic metadata for a basic html web page is usually something like below (note: I've left the < > out of the code so it shouldn't run):

head
meta http-equiv="content-type" content="text/html; charset=utf-8"
title Runaway! /title
meta name="generator" content="LibreOffice 5.3.4.2 (Linux)"
meta name="author" content="Ernest Bywater"
meta name="created" content="2017-08-25T00:11:36.526607882"
meta name="keywords" content="coming of age, crime, violence, action, youth, school, rivers region"
/head

The section between head and /head is the basic metadata.

The text between title and /title is the page title.

The real important part is in the section starting meta name="keywords" and the entries after the content= is the key metadata the search engines find and include in their databases.

In this example from one of my stories the keywords the page will be sorted on will be coming of age, crime, violence, action, youth, school, rivers region, Runaway!

If there are other URLs on the page they will usually be found and listed as well.

Based on this example a search on rivers region or the other content keywords or title will find this page once it's put up and checked by a bot. However, a search on the main characters in the story will not find the page because they aren't listed in the keywords. If I were to add other words or phrases to the keyword contents, then the page would be listed in the search engine databases by them as well.

Depending on how you create the web page a lot of the metadata will be automatically created by the software you use; in some cases the information needs to be entered into the page or some key spot in the source document.

In the above case the keyword content is entered by me in the properties section of the story file, and when I save the story file as html the information is drawn from there and added to the keyword content in the metadata.
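To see what a program can pull out of a header like that, here is a rough Python sketch using the standard html.parser module. The `HeadMetadata` class is made up for this example, and the sample page is the header above with the < > put back. It only shows how the fields can be read; how much weight any particular search engine gives them is a separate matter.

```python
from html.parser import HTMLParser

class HeadMetadata(HTMLParser):
    """Collect the <title> text and every <meta name=... content=...> pair."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in pieces, so accumulate it.
        if self._in_title:
            self.title += data

page = """<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Runaway!</title>
<meta name="author" content="Ernest Bywater">
<meta name="keywords" content="coming of age, crime, violence, action, youth, school, rivers region">
</head>"""

parser = HeadMetadata()
parser.feed(page)
title = parser.title.strip()
keywords = [k.strip() for k in parser.meta["keywords"].split(",")]
print(title)
print(parser.meta["author"])
print(keywords)
```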

Replies: John Demille Vincent Berg

John Demille 🚫
Updated:

@Ernest Bywater

The real important part is in the section starting meta name="keywords" and the entries after the content= is the key metadata the search engines find and include in their databases.

Actually, due to abuse of the keywords meta entry, generally now, search engines totally ignore the keyword entry.

The following is taken from here on the site. That's the latest in SEO as far as I know. I help Lazeez with the SEO parts of his engine.

< title>Erotica Sex Story: Discipling Mrs. Proudbum by harry lime< /title>

< link rel="canonical" href="https://storiesonline. net/s/76898/discipling-mrs-proudbum" />

< link rel="author" href="https://storiesonline. net/a/harry_lime" />

< meta name="title" content="Erotica Sex Story: Discipling Mrs. Proudbum by harry lime" />

< meta name="author" content="harry lime" />

< meta name="description" content="Erotica Sex Story: The students in Mrs. Proudbum's class were all 18 year old graduating students and they were well acquainted with the erotic elements of spirited coupling. The sex-deprived widow had not counted on the devious plans of a pair of twin blonde sisters and a young lad with a need to discover the joys of exploring the kinky desires of mature women rather than dealing with simpering teenagers." />

< meta name="created" content="2014-07-10" />

< meta property="og:locale" content="en_US" />

< meta property="og:type" content="article" />

< meta property="og:title" content="Erotica Sex Story: Discipling Mrs. Proudbum by harry lime" />

< meta property="og:description" content="Erotica Sex Story: The students in Mrs. Proudbum's class were all 18 year old graduating students and they were well acquainted with the erotic elements of spirited coupling. The sex-deprived widow had not counted on the devious plans of a pair of twin blonde sisters and a young lad with a need to discover the joys of exploring the kinky desires of mature women rather than dealing with simpering teenagers." />

< meta property="og:url" content="https://storiesonline. net/s/76898/discipling-mrs-proudbum" />

< meta property="og:site_name" content="Storiesonline" />

< meta property="article:section" content="Erotica" />

< meta property="article:published_time" content="2014-07-10T12:07:42-04:00" />

< meta property="article:author" content="https://storiesonline. net/a/harry_lime" />

< meta property="article:tag" content="Teenagers" />

< meta property="article:tag" content="Consensual" />

The really important part is the 'description' meta tag. Generally, the description should be optimized to fit in the entry that a search engine displays under the title. It should be no longer than 155 characters. So do your best to make the description sound as good as possible within that limit. You want the search engine users to click on the link, so the description is the hook, just like on SOL, but due to the size allowance, Google et al don't show the 500 characters that SOL's engine allows.
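If you want to check a description against that limit automatically, something like the following Python sketch would do it. The `trim_description` helper and the sample text are made up for illustration; the 155-character figure is the one given above, and cutting at a word boundary with an ellipsis is just one reasonable choice.

```python
def trim_description(text: str, limit: int = 155) -> str:
    """Return text unchanged if it fits, else cut it at a word boundary and add an ellipsis."""
    text = " ".join(text.split())  # collapse runs of whitespace
    if len(text) <= limit:
        return text
    cut = text[: limit - 1]  # leave room for the one-character ellipsis
    if " " in cut:
        cut = cut.rsplit(" ", 1)[0]  # drop the last, possibly partial, word
    return cut.rstrip(" ,;:.") + "…"

long_text = "A runaway teenager hides in a river town and finds trouble. " * 5
short_text = "A runaway teenager hides in a river town."

trimmed = trim_description(long_text)
print(len(trimmed), trimmed)
print(trim_description(short_text))
```

A better result usually comes from rewriting the description by hand to fit, since an automatic cut can land in an awkward place.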

Replies: Ernest Bywater

Ernest Bywater 🚫

@John Demille

John,

I did say the SEO was now much more extensive and I was offering only a basic example. However, the key point of what I was saying is the information used by the search engines to identify what to find a page by within their databases was not the page content or any expert shortcuts or terms unless they were listed as metadata. Thus complaining about documents about a particular expert term not being found isn't the fault of the search engine but is the fault of the people who created the relevant pages and didn't put those terms in the metadata for the bots to pick up for the database.

Vincent Berg 🚫

@Ernest Bywater

The Search Engine Optimisation (SEO) options available are more extensive than what I'm about to cover. However, the basic metadata for a basic html web page is usually something like below (note: I've left the < > out of the code so it shouldn't run):

Building on what Ernest outlined, the SEO info. is basically a computer database driven analysis of what titles are the most successful. So you enter a bunch of terms you think might be appropriate, and it'll spit out how many hits it generates (Note: You can also go to Amazon.com and type the same things in the search bar without bothering with SEOs at all!).

Generally, only 1 - 10 results mean the title isn't terribly desirable and no one is likely to ever request it. 500 - 1400 is prime, as there won't be a lot of competition, but the demand for those keywords is high enough that those looking will likely notice your page. Generally, if you tweak the search terms which best describe your book to avoid the most obvious, you'll hit a string of keywords which will pinpoint the most advantageous terms so you'll get noticed in a casual search, without being buried by 50,000 competing products.

Switch Blayde 🚫
Updated:

@Ernest Bywater

Some will scan some of the content, but not all do that,

When I google something, I get results where the words I'm searching on are on the webpage. So the (I forget what they're called — spiders? web crawler?) must scan the web page. I once was told they give more importance to what's on the top of the page.

Replies: Ernest Bywater

Ernest Bywater 🚫

@Switch Blayde

When I google something, I get results where the words I'm searching on are on the webpage.

As I said earlier, some of the web crawlers will scan some web pages. However, they do not scan the content of every web page. I ran some checks on phrases from the material on my own website and all content from the front page was there, but when I went deeper into the site the material beyond the metadata quickly reduced to nothing with a few exceptions which came up with hits on the same item from elsewhere. Which means they do not scan everything on every page, which is what I said.

Ernest Bywater 🚫

Two points, for years I've been getting different search results on names with punctuation marks to those without. It's possible Google have recently changed their search code to eliminate such events. I'll look at that later.

However, you can not blame Google, or any other search engine, for how people list what they put on their web pages. In general, what you see in a search engine result is what happens after the search engine has checked its database for what's been collected from the metadata of web pages and the web page's title. It matters not how others list something, it's how the creator of the web page listed it in the critical area of the page that allows it to be found.

Thus, if the church has been using a particular shortcut since the beginning of time, you will never find it through a search engine until after such time as someone has put that short cut into the metadata area of a web page. Then, the search will only ever find that page. That's why one of the most important aspects of web searches is to think how others would list something.

Geek of Ages

Why is "Ing." short for "Ignatius"?

Replies: JohnBobMead

JohnBobMead 🚫

@Geek of Ages

Why is "Ing." short for "Ignatius"?

Oh God. How long have I been going S. Ing. instead of S. Ign.?

Thank you for raising that question. You were correct, S. Ing. isn't an abbreviation for St Ignatius, S. Ign. is the abbreviation. S. Ign. is what showed up in the search request at ABE Booksearch which got me started on this.

Didn't change the results of the searches, except the confusion is with sign instead of sing. And knowing that, I first wrote "except the confusion was with sing instead of sing." "Augh," to paraphrase Charlie Brown.

Makes it ironic that my current project is a letter by letter transcription of "Vincentio Saviolo his Practise", a 1595 English translation of an Italian fencing manual.

Can't use a spell checker for a letter by letter transcription, especially given how irregular the spelling is in the original (at least three different spellings for some words), that only helps with part two, the normalized EModE (Early Modern English) spelling version, and part three, the modern spelling version. It makes it a lot easier if you use VARD 2 for the normalization process, it uses carefully developed dictionaries and substitution algorithms, it's free to use, but they want an academic email address to obtain a copy. My hope is that if I can present them with good clean copy of this project they'll let me have a copy even though I don't have an academic email address. Have to prove I'm a legit fellow traveler.

The kicker is that I'm not just going for a modern typeface for the text, I'm formatting the text, including images from the source .pdf, to match the original document, which is _not_ how this is done. I'm trying to reproduce the document as if it had been printed using Times New Roman, instead of that godawful Elizabethan typeface.

I'm doing this backward, I produced a modern spelling version formatted this way back in 2013, copy at John Mead's Academia.edu page and Internet Archive, which the Historical European Martial Arts (HEMA) crowd was happy to find out about, for them it being formatted to match the original makes it much more useful than a text document of the transcription, now I'm trying to recreate the intermediate documents for the use of the academics. And to establish my methodology for future transcriptions. I'm finding places where I misread the original document, so going over it again needed to be done.

It keeps me occupied. Being on Disability Retirement, I have lots of free time. It produces something of real use to certain interest groups, which makes me feel better about myself.

Dyslexics _know_ they have to refer back to the source materials. I'm constantly reminded of that, yet I keep messing up.

Replies: Vincent Berg

Vincent Berg 🚫

@JohnBobMead

The kicker is that I'm not just going for a modern typeface for the text, I'm formatting the text, including images from the source .pdf, to match the original document, which is _not_ how this is done. I'm trying to reproduce the document as if it had been printed using Times New Roman, instead of that godawful Elizabethan typeface.

I hate suggesting it, but your best bet, once you've captured all the individual character graphics, will be to hire a font creator to design a specialty font. It'll cost time and money up-front, but it'll result in a more professional product, both print and ebook versions, and you can also push the font as well as the book (depending on the arrangements you make with the designer). The size of the books will also be MUCH shorter. The print book won't be any smaller, but the print book submission will likely shrink from 20MB down to only 4 to 6.

Geek of Ages

Of course the content of the page is crawled to build the search index. It's ridiculous to claim otherwise, and it's trivial to verify.

However, page content alone is not used in google land, at least. They also pay attention to who links to you, and the words they use when linking. That's the whole innovative idea behind PageRank initially, that made them superior to their competitors back when they started.
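For anyone curious what "paying attention to who links to you" looks like in practice, here is a small Python sketch of the simplified textbook form of PageRank. It is not Google's production system, it ignores the link-text part entirely, and the three-page link graph and the `pagerank` helper are made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new[t] += share
        rank = new
    return rank

# Hypothetical graph: two pages link to "store", and "store" links back to "blog".
links = {"blog": ["store"], "forum": ["store"], "store": ["blog"]}
rank = pagerank(links)
for page, score in sorted(rank.items(), key=lambda item: -item[1]):
    print(f"{page:6} {score:.3f}")
```

In this toy graph "store" comes out on top because two pages link to it, "blog" is next because the highly ranked "store" links to it, and "forum" trails because nothing links to it.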

Replies: Vincent Berg

Vincent Berg 🚫

@Geek of Ages

However, page content alone is not used in google land, at least. They also pay attention to who links to you, and the words they use when linking. That's the whole innovative idea behind PageRank initially, that made them superior to their competitors back when they started.

I've started using the PageRankings and 'readers also bought' links to either recommend other works to my readers, or to help advertise my books "Similar to xxx by xxx". There are now several websites where you can enter your ISBN, and it'll pop up a graphic of EVERY book that's been associated with yours via reader searches (i.e. readers who searched for or bought your book also bought). If there are a LOT of sales for any linked product, THAT's the one to draw parallels to, as more readers will likely have read it and will be more likely to trust your book (assuming you aren't lying out of your hat!).

Replies: richardshagrin

richardshagrin 🚫
Updated:

@Vincent Berg

"Similar to xxx by xxx"

Must be "much sex". (X rated is some sex.)

Ernest Bywater 🚫

CW,

That's all good info, but my posts have been as part of an effort to explain why searches on expert terms aren't often successful with search engines due to the terms rarely being used on a website where the web crawler bots will pick them up and index them. Especially when the terms are in book contents or pages deep in the website.
