
Forum: Bug Report and Feature Requests

broken link parsing

Gauthier

https://codepoints.net/U+1F4F1

https://codepoints.net/U+1F4F1

test

Gauthier

@Gauthier

Typing a URL as <a href="url">url</a> breaks the parser.

<a href="url">some text</a> works.

Lazeez Jiddan (Webmaster)

@Gauthier

Fixed.

Gauthier

@Lazeez Jiddan (Webmaster)

Yes, it works now.

Gauthier

Another bug in the automatic link generator.
Actually, there are 2 bugs:
1 It removes the preceding space, making the link stick to the previous text.
2 It grabs trailing punctuation (tested with a comma) as part of the link.
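
Both symptoms can be illustrated with a small sketch. This is Python for illustration only, not the forum's actual PHP code; the `autolink` function, the regex, and the `TRAILING` character set are assumptions made for the example.

```python
import html
import re

# Candidate URL: http(s):// followed by a run of characters that are not
# whitespace, angle brackets, or double quotes.
URL_RE = re.compile(r"https?://[^\s<>\"]+")

# Punctuation that usually belongs to the sentence, not to the URL,
# when it appears at the very end of a match (assumed set).
TRAILING = ".,;:!?)"


def autolink(text: str) -> str:
    """Wrap bare http(s) URLs in anchors and leave surrounding text untouched."""
    out = []
    pos = 0
    for match in URL_RE.finditer(text):
        url = match.group(0)
        stripped = url.rstrip(TRAILING)   # only the END of the match is trimmed
        tail = url[len(stripped):]        # punctuation handed back to the text
        # Everything before the match, including the preceding space, is kept.
        out.append(html.escape(text[pos:match.start()]))
        out.append('<a href="{0}">{0}</a>'.format(html.escape(stripped, quote=True)))
        out.append(html.escape(tail))
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)
```

Because only the end of the match is trimmed, a URL such as http://domain.com/something/0,123,3.html keeps its internal commas.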

Lazeez Jiddan (Webmaster)

@Gauthier

Made some changes that I hope help. This URL parsing thing is tricky.

Gauthier

@Lazeez Jiddan (Webmaster)

1 automatic link test https://storiesonline.net/

2 automatic link test trailing dot https://storiesonline.net/.

3 automatic link test trailing comma https://storiesonline.net/,

Gauthier

@Gauthier

test https://storiesonline.net/s/10959:168869 test

Gauthier

@Gauthier

test https://storiesonline.net/. test

Gauthier

@Gauthier

If you put the PHP code somewhere like pastebin, I can take a look.

Alternatively, take inspiration from the masters of the problem: WordPress.

https://developer.wordpress.org/reference/functions/make_clickable/

You'll see the regex from hell that handles almost all the exceptions.

You can either adapt line 2146 to only allow http(s)? as a protocol prefix and rewrite the callback (fairly straightforward), or follow the dependencies.

I would advise taking a look at the dependencies, as they promote security in a lot of places by normalizing and limiting the accepted data.

But you'll quickly end up deep in their filter architecture, which is related to plug-in support and totally irrelevant for you.

Here are some dependencies:

_split_str_by_whitespace,

_make_url_clickable_cb,

esc_url,

_deep_replace,

clean_url,

wp_kses_bad_protocol,

wp_kses_no_null,

wp_kses_bad_protocol_once,

wp_kses_bad_protocol_once2,

wp_kses_decode_entities,

_wp_kses_decode_entities_chr,

_wp_kses_decode_entities_chr_hexdec,

wp_kses_normalize_entities,

wp_kses_named_entities,

wp_kses_normalize_entities2,

wp_kses_normalize_entities3,

valid_unicode,

wp_allowed_protocols,

kses_allowed_protocols,

...

Note that KSES is a recursive acronym which stands for "KSES Strips Evil Scripts".

So those functions are of particular interest to you.
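
To make the "normalize, then limit" idea concrete, here is a minimal sketch in Python. It does not reproduce WordPress's code; `is_safe_url`, `ALLOWED_SCHEMES`, and the five-round decoding limit are assumptions made for illustration. The point is only that the scheme check happens after entities and control characters have been dealt with.

```python
import html
import re
from urllib.parse import urlsplit

# Assumption: the forum only wants plain web links.
ALLOWED_SCHEMES = {"http", "https"}

# ASCII control characters (NUL, tab, newline, ...), which could be used to
# split up a scheme name so that a naive check misses it.
_CONTROL_RE = re.compile(r"[\x00-\x1f\x7f]")


def is_safe_url(raw: str) -> bool:
    """Return True only if the normalized URL uses an allowed scheme."""
    url = raw
    # Decode entities repeatedly so that "&amp;#106;" cannot hide a "j".
    for _ in range(5):
        decoded = html.unescape(url)
        if decoded == url:
            break
        url = decoded
    else:
        return False  # still changing after 5 rounds: treat as hostile
    url = _CONTROL_RE.sub("", url).strip()
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed input, e.g. an unbalanced IPv6 bracket
    return parts.scheme.lower() in ALLOWED_SCHEMES and bool(parts.netloc)
```

With this sketch, `is_safe_url("https://storiesonline.net/")` is accepted, while `javascript:` URLs are rejected even when a letter is written as an entity or a NUL byte is inserted into the scheme.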

Gauthier

@Gauthier

And that, among other things, is why I told you that securing a forum is a huge task, and that I barely scratched the surface with my security tests. The number of attack vectors through encoding, invalid Unicode, and entities is incredible.

Lazeez Jiddan (Webmaster)

@Gauthier

You know, the simpler solution is to not try to make anything clickable 😈

Anyway, I've made some changes.

Gauthier

@Lazeez Jiddan (Webmaster)

Much better.

Oddly, Firefox has a problem with URLs containing &: they are transformed into HTML entities in the href. That was mandatory for HTML4/XHTML; officially it shouldn't be needed with HTML5, but it should still be tolerated.

Apparently, with HTML5 and Firefox, they are passed as entities to the server, which may then fail. I didn't test with other browsers.
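
One common way for entities to reach the server is escaping the URL twice before writing it into the href. Whether that is what happened here is an assumption, since the forum's code isn't shown. The sketch below uses Python's standard `html.parser` as a stand-in for a browser's attribute decoding; the URL and the helper names are hypothetical.

```python
import html
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href values after the parser has decoded character references."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href")


def parsed_href(markup: str) -> str:
    """Return the first href in the markup, as a parser would see it."""
    collector = HrefCollector()
    collector.feed(markup)
    collector.close()
    return collector.hrefs[0]


url = "https://example.com/search?a=1&b=2"  # hypothetical URL

# Escaped once: the markup contains &amp; and the parser decodes it back to &.
once = '<a href="{}">link</a>'.format(html.escape(url, quote=True))

# Escaped twice: the markup contains &amp;amp; and one round of decoding
# still leaves a literal "&amp;" in the URL that gets requested.
twice = '<a href="{}">link</a>'.format(
    html.escape(html.escape(url, quote=True), quote=True)
)

print(parsed_href(once))
print(parsed_href(twice))
```

The first href decodes back to the original URL. The second still carries "&amp;" in the query string, which matches the symptom described above.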

Gauthier

@Lazeez Jiddan (Webmaster)

Obviously. Note, however, that removing all links won't eliminate all attack vectors ;)

Lazeez Jiddan (Webmaster)

@Gauthier

Sigh!

Yeah, these things suck. Big time!

Gauthier

@Lazeez Jiddan (Webmaster)

Commas inside a URL are a problem for automatic detection:
http://domain.com/something/0,123,3.html

Gauthier

@Lazeez Jiddan (Webmaster)

Still a few issues, but it's already much better.

Gauthier

At least your server tolerates & being replaced with &amp;

Lazeez Jiddan (Webmaster)

@Gauthier

Yes, but that breaks the URLs.

Fixed, I think.

Gauthier

@Lazeez Jiddan (Webmaster)

Not really, the [] are stripped from the URL.

See:

/library/categ.php?key[]=humor&storyType=&contRate[]=5&iip=1&lib=&rf=&ff=&author=&score=&minSize=&maxSize=&p=&sort_field=story_score&sort_order=desc&lc=AND&cmd=Search

becomes:

https://storiesonline.net/library/categ.php?key=humor&storyType=&contRate=5&iip=1&lib=&rf=&ff=&author=&score=&minSize=&maxSize=&p=&sort_field=story_score&sort_order=desc&lc=AND&cmd=Search

which doesn't work.

Yes, those regexps suck big time.
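
For what it's worth, one way to keep the brackets is to let the detection pattern match them and then percent-encode them in the generated href. This is a Python sketch under assumptions, not the site's code; `URL_RE`, `href_for`, and the `SAFE` set are made up for the example, and the query string is shortened from the one above.

```python
import re
from urllib.parse import parse_qs, quote, urlsplit

# The character class excludes only whitespace, angle brackets, and double
# quotes, so "[" and "]" stay inside the match.
URL_RE = re.compile(r"https?://[^\s<>\"]+")

# Characters left untouched by quote(); "[" and "]" are deliberately absent,
# and "%" is present so existing percent-escapes are not encoded twice.
SAFE = ":/?#@!$&'()*+,;=%"


def href_for(url: str) -> str:
    """Percent-encode anything outside SAFE: "[" -> %5B and "]" -> %5D."""
    return quote(url, safe=SAFE)


text = "see https://storiesonline.net/library/categ.php?key[]=humor&contRate[]=5&cmd=Search now"
found = URL_RE.search(text).group(0)
href = href_for(found)

# Decoding the query on the receiving side restores the original keys.
params = parse_qs(urlsplit(href).query)
print(href)
print(params)
```

The encoded form carries the same parameter names once decoded, so the brackets are neither lost nor left raw in the markup.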

Lazeez Jiddan (Webmaster)

@Gauthier

Honestly, I don't care to support the URLs with square brackets.
