I wonder what the statistical breakdown of all the thousands of stories on this site is, i.e. what percentage are 9.0, 8.0,
etc. Would scores divide on a logarithmic or arithmetic scale?
With a premium membership you can search stories by score:
1-2 15
2-3 91
3-4 399
4-5 1878
5-6 8562
6-7 21582
7-8 12634
8-9 2635
9-10 152
total: 47948
stories available: 48329
The gap between the total and the number of stories available is probably the stories without a score that never had voting turned on. Also, you cannot search from 0 to 1, but given the very low number for 1-2 there are probably somewhere between 0 and 15 such stories.
It's obvious that the peak is in the stories scoring 6-7.
These numbers are valid for today, Thursday April 8, 2021.
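For anyone who wants the counts above as percentages, here is a quick Python sketch. The counts are copied from the list above; the variable names are just mine, nothing official.

```python
# Story counts per score band, copied from the search results above.
counts = {
    "1-2": 15, "2-3": 91, "3-4": 399, "4-5": 1878, "5-6": 8562,
    "6-7": 21582, "7-8": 12634, "8-9": 2635, "9-10": 152,
}
stories_available = 48329  # site-wide figure quoted above

total = sum(counts.values())
unscored = stories_available - total  # stories no score search returned

# Print each band with its share of the scored stories.
for band, n in counts.items():
    print(f"{band:>5}  {n:>6}  {100 * n / total:5.2f}%")
print(f"total scored: {total}, without a searchable score: {unscored}")
```

The percentages are relative to the stories that have a searchable score, not to all stories available.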
Also, you can not search from 0 to 1
Theoretically searching for stories posted between 1970-01-01 and 2021-04-08 and including category exclusions ought to bring up every story on SoL. It doesn't, it brings up 40 thousand and change. Sorting the results by score seems to exclude any story without a score or with less than a 1.11 score.
Sorting the results by score seems to exclude any story without a score or with less than a 1.11 score.
I don't think there are any stories below that score. As Dominions Son pointed out, stories with less than 20 votes aren't displayed so it is possible that there are a few stories below 1.11 but without the required 20 votes for display.
Theoretically searching for stories posted between 1970-01-01 and 2021-04-08 and including category exclusions ought to bring up every story on SoL.
A fairly recent change was made to SOL so that stories that don't have public scores are not shown on lists that are sorted by score.
You currently (April 8, 2021) can use score as a criterion and sort by name, which will provide the number of stories in a certain range.
Also, including the category exclusions does not affect the 'characters 13 and under' filter.
A fairly recent change was made to SOL so that stories that don't have public scores are not shown on lists that are sorted by score.
If that is the intent then it isn't working in practice.
Note the search was by date. The results were then ordered by date; changing the order to name does not change the number of stories listed. If what you say is correct, then it should.
1) Go to My Account.
2) In the Preference section there is an 'Access Level'
- There are two levels:
(a) Full Access - which includes characters under 13
(b) Filtered Access - does not include stories with characters under 13
There are 2488 stories that appear in the 'full access' that don't appear in the 'filtered access'.
Once a search criterion is defined (say, after 1999-01-01), you can choose to sort the selected stories by any of the following: title, date, update, size, dnlds (downloads), votes, score, author.
You will get the same number of stories if you sort them by title, date, update, size, dnlds (downloads), votes, or author.
The only time you will see a difference is when your selection criterion is 'score', in which case those stories without a visible score (either fewer than 20 votes or the score is hidden by the author) are not shown.
Agreed, though it does nothing to advance the discussion.
Unless of course every story with no votes or voting disabled also gets filtered by dint of including under 13 characters.
Which I doubt.
There are 2488 stories that appear in the 'full access' that don't appear in the 'filtered access'.
That's all? Since many authors came from ASSTR, the 'underaged' stories were never that popular here, but still, that number seems abnormally low. I'm guessing either the authors yanked their underaged stories, or they were physically removed for one reason or another. Statistically it makes sense that it would be small, since so much has been written since the Canadian law was passed, but there should still be more from that date and before.
Isn't management culling some of the older ones judged to have negligible literary merit?
AJ
the stories without a score that never had voting turned on.
It's possible for a story to have voting turned on and not display a score. If there are fewer than 20 votes it won't show a score
It's possible for a story to have voting turned on and not display a score. If there are fewer than 20 votes it won't show a score
Yep, that's also a possibility.
I thought that the scores were adjusted so that 6 would be the average. So I would have expected about the same number of stories above six as below. But that doesn't seem to be the case. It looks like there are many more stories with scores above six than below six.
I thought that the scores were adjusted so that 6 would be the average. So I would have expected about the same number of stories above six as below. But that doesn't seem to be the case. It looks like there are many more stories with scores above six than below six.
Remember that what I listed as "6-7" are scores 6.0, 6.1, 6.2, etc. I'm not sure what the weighted average is but Lazeez explained that it is relative to all scores in a certain period so 'average' may change over time. Or something like that :D
I thought that the scores were adjusted so that 6 would be the average. So I would have expected about the same number of stories above six as below. But that doesn't seem to be the case. It looks like there are many more stories with scores above six than below six.
It's definitely not a uniform distribution, which is good, as it shows that, aside from the kinky stroke stories, there are more outstanding stories than sub-par tales.
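To put a number on the "above six versus below six" question, here is a rough Python sketch using the band counts from the breakdown posted earlier in the thread. It only knows the one-point bands, so it can say which band holds the middle story but not the exact average; the variable names are mine.

```python
# (lower bound of band, story count), copied from the breakdown posted earlier.
bands = [(1, 15), (2, 91), (3, 399), (4, 1878), (5, 8562),
         (6, 21582), (7, 12634), (8, 2635), (9, 152)]

total = sum(n for _, n in bands)
below_six = sum(n for low, n in bands if low < 6)
six_or_above = total - below_six

# Walk the cumulative count to find the band holding the middle story.
running = 0
median_band = None
for low, n in bands:
    running += n
    if running >= total / 2:
        median_band = low
        break

print(f"below 6: {below_six}, 6 or above: {six_or_above}")
print(f"median falls in the {median_band}-{median_band + 1} band")
```

So the middle story sits somewhere in the 6-7 band, which is at least consistent with scores being pulled toward 6 even though far more stories land at 6.0 or higher than below it.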
you can not search from 0 to 1
Since the minimum score is 1, it is very difficult to explain how an adjusted score could be less than 1. Scoring is moved toward 6 from both high scores and low scores. If a story got only 20 votes of 1, its score likely would be 1, but that seems a very unlikely event. "Hated it" seems to me to be a more likely result than "You call this a story?"
Since the minimum score is 1, it is very difficult to explain how an adjusted score could be less than 1.
Yep, I didn't realize the scoring started at 1. As a programmer I'm used to starting at 0 :D
As a programmer I'm used to starting at 0 :D
I am also a programmer. AFAIK of all the programming languages out there, only C & C++ do that.
I am also a programmer. AFAIK of all the programming languages out there, only C & C++ do that.
As far as I know most are 0-based. This site: https://iq.opengenus.org/array-indices-start-from-1/ says there are 20 programming languages that start array indices with 1. Considering the number of programming languages out there that's minor. The funny thing is that FoxPro and Cobol are listed. It's been a very long time, but I programmed a lot with both and don't remember starting indices with 1.
This site: https://iq.opengenus.org/array-indices-start-from-1/ says there are 20 programming languages that start array indices with 1.
I know from direct personal knowledge that that list of languages that start indices with 1 is incomplete.
I know from direct personal knowledge that that list of languages that start indices with 1 is incomplete.
Most likely yes, it was the first result with a quick search. The fact that it's possible at all to have such a list indicates that the number is limited though. It doesn't matter really, as long as you know what the base is for the language you use :D
first result with a quick search. The fact that it's possible at all to have such a list indicates that the number is limited though.
The number is limited not because there are more languages that start at 0 but because the total number of programming languages is finite.
The only languages that article mentions that start at 0 are C and C++; it doesn't even attempt a list of that side of it.
Without a complete list of both, no valid conclusion can be drawn from that article about which is more prevalent.
Off the top of my head: Java, JavaScript (no relation), PHP, Perl, Python, and Pascal all 0-index by default, though in some cases you can make them behave otherwise.
(For example, PHP's arrays are all actually hashes under the hood, so you can set them to whatever... you're not even limited to integer indexes. But they still default to 0 if you don't explicitly set them otherwise.)
So that's 8 (counting C and C++) that start at 0 and a list of 20 that start at 1 to which I could add 1 or 2 more (though granted, one is fairly obscure).
Here's a nice StackExchange discussion: why are zero based arrays the norm. It references an interesting article by Edsger W. Dijkstra that is also referred to in this discussion: http://xahlee.info/comp/comp_lang_array_index_start_0_or_1.html.
What it comes down to is that programming is basically math and math starts with 0. It's the everyday human counting that starts with 1.
What's the 0th letter of the alphabet? ;-)
One has nothing to do with the other so it's an irrelevant question ;)
So you prefer implementational purity over functionality?
I prefer to think of programming languages as tools, to do man's bidding rather than to be man's master. So, unsurprisingly, I prefer higher level languages.
AJ
So you prefer implementational purity over functionality?
Where did I say that? I think I tried to make it clear that I consider 0-based as of higher functionality. Preferring higher languages or not has nothing to do with that, most higher languages use 0-based too.
I consider 0-based as of higher functionality
What's the 0th letter of the alphabet?
What's the 0th entry on your bank statement?
It's not how the real world functions, so it's less easy to map real world concepts onto it.
AJ
What's the 0th letter of the alphabet?
What's the 0th entry on your bank statement?
It's not how the real world functions, so it's less easy to map real world concepts onto it.
You don't get it. Programming is a technical and logical exercise and not a "real world" function even if the resulting product is for "real world" usage. Even with a 1-based language it needs the zero for a multitude of functionalities. You don't "map" real world concepts, you create functionalities that support real world concepts in a way the user understands, which has nothing to do with how the programmer creates it. If a programmer requires a 1-based programming language to understand what he's doing, he'd better start looking for a different career because he's not going to last long. Every programmer I know, knows how to deal with both systems without a thought.
You don't "map" real world concepts
I do. In fact, I don't just map real world concepts, I map real world objects.
I do application development for a geospatial information system. :)
If a programmer requires a 1-based programming language to understand what he's doing, he'd better start looking for a different career because he's not going to last long.
Since this is personal experience it's only anecdotal, but one of the most frequent mistakes in C/C++ applications results from the index for an array being out by 1 because the programmer dealt with the transformation from computer language to something tangible in the real world 'without a thought'.
AJ
Since this is personal experience it's only anecdotal, but one of the most frequent mistakes in C/C++ applications results from the index for an array being out by 1 because the programmer dealt with the transformation from computer language to something tangible in the real world 'without a thought'.
One of my instructors in college even had a name for those kinds of off-by-one indexing errors. He called them "fence post errors".
He described the problem like this:
You need to build a fence with n sections. How many fence posts do you need?
Many are tempted to say n posts. This may or may not be right; you don't have enough information yet.
Each fence segment needs two posts and can share a post with the next/previous segment.
If it's a "closed" fence, that is a fence that fully encloses an area with the first and last segments connected to each other, you need n posts for n segments.
However, if it's an "open" fence, and the first and last segments don't connect, then you need n+1 posts for n segments.
A fence post error means you were either counting for an open fence when you had a closed fence or counting for a closed fence when you had an open fence.
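The fence rule above fits in a few lines of Python. This is only a sketch of the counting rule as described; `posts_needed` is a name made up for illustration.

```python
def posts_needed(sections: int, closed: bool) -> int:
    """Posts required for a fence of `sections` segments.

    closed=True  -> the last segment joins the first, so posts == sections.
    closed=False -> the fence has two loose ends, so posts == sections + 1.
    """
    if sections < 1:
        raise ValueError("need at least one fence section")
    return sections if closed else sections + 1


print(posts_needed(10, closed=True))   # 10
print(posts_needed(10, closed=False))  # 11
```

The bug is never in the arithmetic itself, it is in picking the wrong value of `closed` for the problem at hand.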
Since this is personal experience it's only anecdotal, but one of the most frequent mistakes in C/C++ applications results from the index for an array being out by 1 because the programmer dealt with the transformation from computer language to something tangible in the real world 'without a thought'.
Never encountered the problem, not myself and none of the programmers I met in my 35+ years of software development. If you're programming and dealing with an indexed collection or array you're working at a level that at that point has no relation to anything remotely connected to the "real world". It's used to perform a task in the logical flow of a program. If you're working on a UI you might get a little closer but not in the 'workhorse' code behind it. The only situation I can think of is if you have to work with two programming languages at the same time with a different base. Even then I doubt I would have a problem with it.
no relation to anything remotely connected to the "real world".
My C/C++/Java days (long past, thankfully) were in commercially-oriented environments which tend to be all about humping data from one place to another, so there was always real-world involvement, from screen formatting to databases and other persistent storage.
But even nowadays I encounter plenty of situations where the misalignment of a 0 based language with the real world is probably the reason for a wrong answer eg the number of new e-mails count is out by 1, or the number of new notifications count is out by 1.
AJ
But even nowadays I encounter plenty of situations where the misalignment of a 0 based language with the real world is probably the reason for a wrong answer eg the number of new e-mails count is out by 1, or the number of new notifications count is out by 1.
I'd say that's more likely not any array base confusion, rather that most people don't realize that if you go from the Kth to the Mth element of a list, you've got M-K+1 elements - the traditional out-by-one error.
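The M-K+1 rule is easy to check in Python. A small sketch, with `count_inclusive` as a made-up helper name: counting from the Kth to the Mth element inclusive gives M-K+1 items, while a half-open range from K to M gives only M-K.

```python
def count_inclusive(k: int, m: int) -> int:
    """Number of elements from the k-th to the m-th, both ends included."""
    return m - k + 1


items = list(range(1, 11))      # ten items, think of them as numbered 1..10
k, m = 3, 7                     # 3rd through 7th item, inclusive

selected = items[k - 1:m]       # Python slices are 0-based and half-open
print(selected)                 # [3, 4, 5, 6, 7]
print(count_inclusive(k, m))    # 5
print(len(range(k, m)))         # 4, the classic out-by-one answer
```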
But even nowadays I encounter plenty of situations where the misalignment of a 0 based language with the real world is probably the reason for a wrong answer eg the number of new e-mails count is out by 1, or the number of new notifications count is out by 1
Which all rolls back to bad programming/testing.
Back in the day, my job was essentially 'operating system generalist' for one flavor of Unix (this is pre-Linux, or at least Linux as a commercial venture). Get assigned a problem, fix problem, move on to next problem.
The problem I was given was interesting. A customer would run a LISP environment, and it would fail with an insufficient memory error. They would run it again, and it might start, or not. Usually by the third try it would start. It'd run for quite a while, and then the system would crash hard.
I traced this to an off-by-one error in the virtual memory system which resulted in more VM space being 'available' after the first failure (and still more after the second). There was not, in fact, any more VM space, thus the hard crash.
So, I fixed the error and tested it. Now it would fail, and fail, and fail. After several loops, the system ran out of VM space entirely and everything failed to allocate even trivial amounts of memory.
Debugging that I found an off-by-one error in the other direction.
It turned out there were six nearly offsetting off-by-one errors by the time the entire mess was worked out.
The customer wasn't happy, of course, because instead of being able to run LISP for a while, now they couldn't run it at all (without buying more memory; had to be physical, not just more backing store/hard disk space, too).
It's not how the real world functions, so it's less easy to map real world concepts onto it.
Actually it is. IF you look at it in context.
The original 'computers' were basically a large connection of switches, each of which could either be 'off' ( 0 ) or 'on' ( 1 ) so the zero isn't actually a mathematical value but the status of a real world switch.
Programming language has evolved but binary is still binary.
Now for the sex part.
Switches are acknowledged to go both ways, but if they only go the way they are told to go, are they really switches or just submissive?
Should a switch room be properly referred to as a substation?
:)
Switches are acknowledged to go both ways, but if they only go the way they are told to go, are they really switches or just submissive?
Should a switch room be properly referred to as a substation?
Subs allegedly hold the power in a dom/sub relationship so perhaps it should be a power station ;-)
AJ
Only in fiction.
OMG! Does that mean that not everything I read in SOL stories is gospel truth?
AJ
OMG! Does that mean that not everything I read in SOL stories is gospel truth?
You need to go to Saskatchewan and find Sirus' Olde Laundry if you want to meet Goose Pelt Ruth. :)
OMG! Does that mean that not everything I read in SOL stories is gospel truth?
You can read now..??
We thought you stuck with the illustrated stories.
Not that they were stuck before you got to them...
:)
Depends if 'performance' is important in the 'functionality'. And I mean absolute, full-out, wring-every-ounce-out-of-the-system performance.
If so, you can't beat lower-level languages in the general case. C or a subset of C++ are usually the sweet spot for high-performance code. Obviously assembly language is the ultimate, but very, very few people can craft assembly language as tightly as a good compiler will manage, and (with the exception of extremely critical routines) it's simply not worth the developer time to try.
For anything else, I agree with higher-level languages. But I've been a systems programmer almost all of my career, so what I'm best at is C or the parts of C++ that don't have high performance tradeoffs, do memory allocations under the covers, etc.
I always remember the quote that the C family of programming languages is "write once, read only". I regard them as quite low level because of their distance from English.
AJ
I prefer to think of programming languages as tools, to do man's bidding rather than to be man's master. So, unsurprisingly, I prefer higher level languages.
I've always preferred the higher level languages, both for speed of development and simply because it's hard to include artificial decisions-based programs without one.
But in most cases, when you get the basic algorithms established, you'll convert to some version of Assembler just for speed, using as few commands as possible, as each iteration and/or complication slows down the computation speeds.
If you've never coded in binary, you wouldn't understand the difference it makes.
If you've never coded in binary
I've written Z80 Assembly Language routines but nowadays coding at such a low level is harder and chip speeds are faster so it's not really worth my while :-(
AJ
I've written Z80 Assembly Language routines but nowadays coding at such a low level is harder and chip speeds are faster so it's not really worth my while :-(
Z80, 6502, TMS-1000, PDP-11 Assembler, 4004, 8080 - fun, but yeah, no longer worth it.
What's the 0th letter of the alphabet? ;-)
To find it, follow these instructions;
1. Heat then pour the contents of a tin of "alphabetti spaghetti " into a bowl.
2. Look at the pretty, tomato flavoured letters.
3. Eat ALL of them, even any part letters.
4. Lick the bowl clean.
5. Study the bowl carefully because it now contains the 0th letter of the alphabet.
:)
What it comes down to is that programming is basically math and math starts with 0.
No, that's not about math. It's a consequence of implementing arrays as a hack on top of a memory address pointer rather than having a real array structure. Your array index isn't really an index, it's an offset from a memory address.
The article talks about the math being easier/simpler with a 0 based index. But that's only true if you are doing pointer manipulation, not dealing with a real indexed array.
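To make the "index is really an offset" point concrete, here is a small Python sketch of the address arithmetic. The function name, the base address 1000 and the 4-byte element size are all made up for illustration; no real memory is touched.

```python
def element_address(base: int, index: int, item_size: int, first_index: int = 0) -> int:
    """Address of an element when the array starts at `base`.

    With first_index=0 the index is used directly as an offset.
    With first_index=1 the index must be shifted down by one first.
    """
    return base + (index - first_index) * item_size


# 0-based: the first element sits exactly at the base address.
print(element_address(1000, 0, 4))                  # 1000
# 1-based: the same first element needs the extra subtraction.
print(element_address(1000, 1, 4, first_index=1))   # 1000
# Fourth element in 0-based terms.
print(element_address(1000, 3, 4))                  # 1012
```

With a 0 base the subtraction disappears, which is the whole of the "simpler math" argument when you are working at the pointer level.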
You can add Bash (the most common Linux shell) to the 0s, and probably most/all of the other Linux shells. Even M$DOS batch, although IIRC there are no user-manipulable array structures; %1, %2 etc are the command line arguments and could be considered an array, %0 is the name under which the batch file was invoked, although there is no indexing mechanism to select the Nth arg (not counting SHIFT) so regarding M$DOS arguments as an array is dubious at best.
I'm fairly sure that DEC's BASIC+ used 0-indexed arrays, and think that BBC-Basic did as well, though I'm fairly sure that M$ QuickBasic was 1-indexed. The various assembly languages are 0-based, if they have any concept of arrays, and even the processors - 6502 or Z80, the index register was added to the offset, no extra 1.
Indeed, it's possible that people don't even realise that they are using a 0-based array; if you are using a 1-based array and reference element 0, you get an error, but if you are using a 0-based array and only reference elements 1 up, you don't get an error for not using element 0; it only shows up if there's a non-indexed method of accessing the array, like
ptp=( $( LINES=10 top -n 2 -d 0.1 | tail -n 9 | head -n 1 ) )
whence ${ptp[7]} gives you the processor idle time %ge, as it's the 8th item in the 3rd line of the reduced output from 'top'.
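The same effect shows up in Python, which is also 0-based. A sketch using placeholder words instead of real 'top' output (the words w1..w9 are invented stand-ins, not actual data):

```python
# Placeholder words standing in for one line of whitespace-separated output.
line = "w1 w2 w3 w4 w5 w6 w7 w8 w9"

fields = line.split()   # filled without any explicit index: first word lands at index 0
eighth = fields[7]      # index 7 is therefore the 8th item, as with ${ptp[7]}

print(fields[0])        # w1
print(eighth)           # w8
```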
though I'm fairly sure that M$ QuickBasic was 1-indexed.
It was, as was its big brother MS Basic 7.x PDS (Professional Development System). This continued with Visual Basic until VB6 (or was it already VB5?) added some new data types which defaulted to 0-indexed. But you could still change all arrays to 1-indexed or 0-indexed by setting OPTION BASE 1 or OPTION BASE 0.
Until then any variable was automatically initialized to a default value (0 for numerical variables), but those new data types needed to be initialized separately (as in C)!
BTW, accessing sequentially large parts of two-dimensional arrays on disk could cause performance problems if the programmer wasn't aware how his programming language handled it (by column or by row). IIRC, MS-Basic and Pascal did it differently.
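A rough Python sketch of why the storage order matters. It makes no claim about which language uses which layout; it only shows the two offset formulas, with function names and the 3x4 array size picked for illustration.

```python
def row_major_offset(row: int, col: int, rows: int, cols: int) -> int:
    """Linear offset when each row is stored contiguously."""
    return row * cols + col


def col_major_offset(row: int, col: int, rows: int, cols: int) -> int:
    """Linear offset when each column is stored contiguously."""
    return col * rows + row


rows, cols = 3, 4
# Walk along the first row and see where each element lives in storage.
print([row_major_offset(0, c, rows, cols) for c in range(cols)])  # [0, 1, 2, 3]
print([col_major_offset(0, c, rows, cols) for c in range(cols)])  # [0, 3, 6, 9]
```

Walking a row is sequential in one layout and jumps by a whole column in the other, which on disk means a seek per element if you loop in the wrong order.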
HM.
That was not, nor was it intended to be, a comprehensive list. The mere fact that I can come up with, off the top of my head, half a dozen major, widely-used (okay, maybe not Pascal so much these days) languages that 0-index and are not C or C++ or even C# or Objective C means that "as far as you know" is not actually very damn far.
Here's a Wikipedia page that compares how various programming languages handle arrays.
I count 37 that 0-index, and 20 (possibly the same 20 as Keet's list; I'm not going to go to the effort of going through and comparing) that 1-index, plus a couple of oddballs. I know both lists are incomplete; I've used both 0- and 1-indexing languages that aren't included (notably, every Unix shell that supports arrays belongs in the 0 list, while at least some SQL dialects belong on the 1 list).
But more relevant than the raw number is the fact that practically every major modern programming language 0-indexes. The 1-indexing list is mostly old (FORTRAN, COBOL), obscure (though to be fair a lot of the 0-index ones are too), or not designed as general purpose programming languages, and intended for use by people who are not primarily programmers (Mathematica, FoxPro).
C or C++ or even C# or Objective C
They are all descendants of a single parent I believe, as are the many variants of COBOL (which tend to end in -OBOL).
FWIW I've used a couple of high-level languages oriented towards database and screen manipulation that most here probably haven't heard of. Both are/were 1-based. Perhaps that's significant - the higher-level the language, the more likely it is to be 1-based.
Does anyone know which category LISP fits into? I only used it a couple of times as a trial, but it didn't prove suitable for the work required of it.
AJ
LISP is zero-based. It can also have a zero-dimensional array (which has one element).
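The one-element result is less odd than it sounds: the element count of an array is the product of its dimension sizes, and the product of no dimensions at all is 1. A Python sketch of just that arithmetic (this is not LISP code, and `element_count` is a made-up name):

```python
import math


def element_count(shape: tuple) -> int:
    """Total number of elements in an array with the given dimension sizes."""
    return math.prod(shape)


print(element_count((3, 4)))  # 12
print(element_count((5,)))    # 5
print(element_count(()))      # 1, the zero-dimensional case
```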
Perhaps that's significant - the higher-level the language, the more likely it is to be 1-based.
That's not true. By the same reasoning I could say that the higher the language, the less it is focused on professional programmers. Just as unfounded. It's more likely that the closer a language is to programming UIs, the more likely it is to be 1-based. The 'highest' languages often have another problem: they fail to offer fine control over parts of the offered functionality. In the end it's what you know and are comfortable with that is the 'best'.
The 'highest' languages often have another problem: they fail to offer fine control over parts of the offered functionality
That's something I agree with. I recall from my COBOL programming days lots of frustration at not being able to do things that were easy in lower-level languages.
AJ
That's something I agree with. I recall from my COBOL programming days lots of frustration at not being able to do things that were easy in lower-level languages.
The problem with COBOL is that its functionality makes it almost exclusive for programming administrative software. You can do some basic math but don't expect any high-level math. On the other hand it's VERY structured, if you like that. It's definitely very reliable.
almost exclusive for programming administrative software.
I did quite a bit of banking and finance work in it. It's not suitable for the sort of maths I do in my research nowadays, but it's fine for everyday stuff.
AJ
I did quite a bit of banking and finance work in it. It's not suitable for the sort of maths I do in my research nowadays, but it's fine for everyday stuff.
If you're still willing to work with COBOL you can earn tons of money in the banking industry. Lots of COBOL software there and they can't find enough programmers to maintain it.
you can earn tons of money in the banking industry. Lots of COBOL software there and they can't find enough programmers to maintain it.
Largely because of the BCD (Binary Coded Decimal) number system, which allows mathematical operations on non-integer currency values without the imprecision of floating point numbers.
Binary floating point cannot represent most decimal fractions (0.1, for example) exactly, so such values are approximations and will show error if you display them out to enough decimal places.
BCD numbers represent decimal values exactly.
On the other side of the equation, BCD numbers are relatively expensive in terms of system memory compared to even double precision floating point numbers.
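A quick Python illustration of the float-versus-decimal point. Note the assumption: Python's `decimal` module is used here as a stand-in for decimal arithmetic in general. It is not BCD storage as COBOL uses it, but it shows the same exactness for currency-style values.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 are both stored as approximations.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)         # False
print(f"{float_sum:.20f}")      # the error shows up with enough decimal places

# Decimal arithmetic: the same values are held exactly.
decimal_sum = Decimal("0.10") + Decimal("0.20")
print(decimal_sum == Decimal("0.30"))  # True
print(decimal_sum)                     # 0.30
```

The decimals are built from strings on purpose; building them from float literals would carry the binary approximation along with them.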
If you're still willing to work with COBOL you can earn tons of money in the banking industry.
Okay, I get the message - you don't like my stories. In which case you'll absolutely hate my next one, if I ever manage to finish it ;-)
AJ
Okay, I get the message - you don't like my stories. In which case you'll absolutely hate my next one, if I ever manage to finish it ;-)
I couldn't tell because I haven't read any of your stories, yet. Most are much shorter than I tend to read. SOL offers me such a huge reading list with many stories that I like to re-read too. I have Gay! on my reading list but it will take some time before I get to it.
On the other hand, programming COBOL will earn you (and any other author on SOL) a lot more than writing :D
"Hated it" seems to me to be a more likely result than "You call this a story?"
Except 1-votes are often delivered as punishment for writers broaching ideas that the voters object to (e.g. scat, water sports, politics, the wrong kind of politics, gay stories or stories about minorities). But very few readers will ever vote 1 for any story, as the tendency, if you really dislike a story, is simply not to vote on it at all. You have to really despise the issues raised to ever cast a vote below 3.
Typically, once you've offended someone, they'll continually post 1-votes as an ongoing protest at your simply still breathing, but they'll adjust it over time when the author writes a particularly good chapter, which demonstrates that a 1-vote does NOT indicate that they consider the story unreadable. Often, the more popular and/or expansive a story is, the more 1-votes it'll generate.
There was an extensive discussion on this a few times a couple of years back, and twice I did an extensive analysis, which is available at:
https://www.dropbox.com/s/3h3e30rrfavko27/SoL%20stats%20April%202018.jpg?dl=0
https://www.dropbox.com/s/51wx2c3ak0c418v/SoL%20stats%20July%202017.jpg?dl=0
At the time I did the analysis in the 2 files I gave links to, the totals in the files matched the total number of files on SoL. However, in 2018, while 5.41% were in the 8-9 group and 0.35% were in the 9-10 group, 1.91% of the stories had no scores at all. Naturally the biggest group was 6-7 with 43.30%.
Since all of those rates were fairly close to the 2017 percentages for the same items I suspect the rates are still similar to the 2018 ones.
That list of 1-based languages is far from complete.
Zero-based indexing should have disappeared when people moved on from coding in machine language on the "bare metal".
Just my opinion, and as usual, I am right.
The "zero versus one" dichotomy extends into the platforms used for industrial control. I was never formally schooled in this stuff. I learned it in self-defense over years of dealing with the growth of technology in the heavy industry and utility side of things.
Once I realized it (and many, many others didn't) I held a tool that solved many issues in a very satisfying fashion, as in "You spent HOW many days looking for the problem?"