HTML Entity Character Lookup

May 31st, 2007

Using HTML entities is the right way to ensure all the characters on your page are validated. However, finding the right entity code often requires scanning through 250 rows of characters.

This lookup allows you to quickly find the entity based on how it looks, e.g. like an < or the letter c.

HTML Entity Lookup

Dashboard Widget

The HTML entity lookup is also available as a Dashboard widget.

The widget works in the same way the web version does, and does not require an Internet connection to function.

Clicking on a row will copy its HTML entity code to the clipboard.

Download the dashboard Entity Lookup widget

Features

  • Search for entity characters based on how they look (taken from the W3C list of entities)
  • Switch between standard and compressed views
  • Copy the HTML entity to the clipboard
  • Add your own keyword terms and characters to entities
  • Settings stored in a browser cookie
  • Available to be installed as a search plugin
  • Available as a Firefox plugin - thanks to Yining

To reset the keywords, clear your cookies for this page; the default keyword dictionary will then be restored.

How it Works

The lookup searches the HTML entities for matches to the searched character, based on how that character looks. For instance, the letter c would match the © and ¢ entities, because of the way they look.

There's no clever logic behind this, only the most powerful computer known to man - man's own brain.

Each entity has had a list of 'like' matches added to it by hand and eye. This list is stored in a local dictionary file and loaded during start-up (since it's so small, there's no point in using an AJAX-like solution).

The entity lookup also supports word searches and multiple space-separated search terms, such as copy and cent.
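
As a rough illustration of the approach described above, here is a minimal JavaScript sketch. The dictionary entries, the `likes` field and the `lookup` function are assumptions made up for this example; they are not the tool's actual data or code.

```javascript
// Hypothetical hand-built dictionary: each entity lists the characters
// and words a person might type when looking for it.
const entities = [
  { entity: '&copy;', char: '\u00a9', likes: ['c', 'copy', 'copyright'] },
  { entity: '&cent;', char: '\u00a2', likes: ['c', 'cent'] },
  { entity: '&lt;', char: '<', likes: ['<', 'less'] },
  { entity: '&laquo;', char: '\u00ab', likes: ['<', 'quote'] },
];

// Split the query on whitespace and return every entity whose 'likes'
// list contains at least one of the terms (exact, case-insensitive match).
function lookup(query) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return entities.filter((e) => terms.some((t) => e.likes.includes(t)));
}

console.log(lookup('c').map((e) => e.entity));         // [ '&copy;', '&cent;' ]
console.log(lookup('copy cent').map((e) => e.entity)); // [ '&copy;', '&cent;' ]
console.log(lookup('<').map((e) => e.entity));         // [ '&lt;', '&laquo;' ]
```

Because the whole dictionary is a small in-memory array, a linear scan on every keystroke is cheap, which fits the decision to load it once at start-up.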

Comments, feedback and suggestions are welcome.

Comments

New comment Mindables said on May 31, 2007:

Pretty nifty! I'll probably end up using the widget more than the site, however.

New comment Sherlock said on June 1, 2007:

the "Done" button shouldn't be put far off on the back.

New comment Remy Sharp said on June 1, 2007:

@Sherlock - I think I know what you mean - i.e. if you have results shown, the done button is much lower down. If so, I've fixed that now.

Also - I've made a small change to the lookup so that it matches on partial terms, e.g. searching for 'cop' will return the &copy; entity.

New comment Gudbergur said on June 4, 2007:

Very cool!

New comment Drew Marsh said on June 4, 2007:

Great work! Any chance you could make a Vista "gadget" out of it as well? Love to have this on my desktop.

Cheers,
Drew

New comment David Janes said on June 4, 2007:

Very nice. Consider adding the zero width space in for the space character also:
http://www.fileformat.info/info/unicode/char/200b/index.htm

New comment wade said on June 4, 2007:

the 'ajax' side of this could be interesting as you could allow the dictionary to be maintained wiki style. i.e. I would match 'at' and '@', maybe a ranking component...
A bit of work, but the community aspect would be neat

New comment Remy Sharp said on June 4, 2007:

@David - I've tweaked the dictionary to match the zero width spaces on the space character - try searching for ' ' (i.e. an empty space).

@Wade - definitely. I've been trying to think about how I could expand the tool to accept community input on the matching. I've got a few ideas which I may explore soon. I'll let you know.

@Drew - I'll have a look into it, but I can't promise anything - I'm a Mac user so I was already familiar with Dashboard. Vista gadgets I suspect are a whole different ball game - but I will investigate :-)

New comment Eric Jablow said on June 4, 2007:

I know they aren't well-supported yet, but could you add the other Unicode spaces—the various sizes of thin spaces, the number space, and also the invisible operators appearing in the Unicode math block?

New comment Ed Knittel said on June 4, 2007:

How difficult would it be to make this into a Yahoo! widget so that it could be used on a PC? There's no information about a copyright with this, which is why I ask. If the Dashboard widget works without requiring an internet connection, I believe it would be possible to make something identical for PC users. If you can make the Dashboard code available I would very much like to make a Yahoo! widget.

New comment Remy Sharp said on June 4, 2007:

@Ed - I've had a quick look at Yahoo! widgets (it's Konfabulator - right?) - and I can't see it being that hard to convert the widget. If you don't mind, I'll take this offline because there are some enhancements which the widget should benefit from, e.g. some kind of shared/self-updating dictionary.

New comment : ) said on June 4, 2007:

Where is @ when I type a : )

New comment Ed Knittel said on June 4, 2007:

Yahoo! Widgets is indeed the old Konfabulator. When you establish what the next gen will be I'd be happy to discuss it with you offline (I looked at the JS entity data and it seems very straight forward).

New comment Remy Sharp said on June 4, 2007:

Eventually I'm going to trip up on this, but it's worth pointing out that the following two don't have html entity substitutes:

  • @
  • $

I'm sure there are more, but the dollar, in particular, has been queried before.

The data is based on the W3C list of entities and I checked the XHTML 1.0 DTD too.

Personally, I always thought the $ did have an entity for it!

New comment Ed Knittel said on June 4, 2007:

Also, I disagree that it should be a wiki-type dictionary. That would require the widgets/dashboard to have internet access. There are only so many HTML entities http://www.w3schools.com/tags/ref_entities.asp
It's just a matter of adding in the rest.

New comment dasickis said on June 4, 2007:

I think that -> should bring up the right arrow as well as >.
<-> = double arrow
<- = left arrow

New comment Henrik N said on June 4, 2007:

Great work; thanks! I would like it even more if I could copy the glyph (as UTF-8) to clipboard from the widget and not just the HTML entity. Perhaps as a toggle on the back of the widget, or depending on what column you click.

New comment David said on June 4, 2007:

I'd match lower case rho (ρ) with "p" instead of (or in addition to) "o".

Still, very nice.

New comment Dashifen said on June 4, 2007:

A Google Gadget for this would be handy as well.

New comment Jake said on June 4, 2007:

suggestion: show '$' when 's' is requested.

New comment Darren Embry said on June 4, 2007:

Nice.

Now if I could right-click → Add a Keyword for this Search in Firefox, this would be perfection.

New comment Reno said on June 5, 2007:

Smashing fine tool, been cranking out the usage on this for the last few minutes and still playing with it.

New comment Brett Taylor said on June 5, 2007:

You don't have macron letters in there: Ā etc.

New comment margus said on June 5, 2007:

Awesome widget, I love it. Found a weird thing though. Searching for "bullet" gives an empty result. Yet, searching for "mark" results in trademark.
Seems to be a bit inconsistent, but would be awesome if the search also included the description words.

New comment Martijn v/d Ven said on June 5, 2007:

Love it, especially the way you can call up "right-to-left" and "left-to-right" by just typing "><"/"<>".
Maybe there would be a way to, later on, define them by simple strings as well.
For example; "*" gives a bullet, a dot and other things that might be used as bullets. Making a real list of bullet-like entities come up when typing "bullet" or "dot" might be a nice addition.
At the moment "bullet" doesn't give me anything, and I'll have to stick with just "bull". This is a bit silly when searching for something that is not called a "bull" in normal speak.
That way, adding some strings might improve the usability.

My 2¢

New comment John Mitow said on June 5, 2007:

Awesome tool, I will use it more than once I think.
Thank you

JM

New comment Calítoe.:. said on June 5, 2007:

It looks great, but it would be even greater if it found characters from languages like Romanian and Polish... For example, ă or ę

New comment Remy Sharp said on June 5, 2007:

@Calítoe + Brett - could you point me in the direction of the entity lists for these characters so I can include them. Thanks.

Also - I've included the word 'bullet' to match, and I'm hoping to make some interface changes that will give you a touch more control over the output.

New comment John Mitow said on June 5, 2007:

what about the И?

New comment Graeme Mathieson said on June 5, 2007:

Very shiny. We were having a discussion in the office about the automated way in which you could implement this, by shifting all the characters into their composed forms, then looking at each of the characters that provided the composition.

Then we read the 'how it works' and, well, that works much better. :-)
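
For comparison, the automated idea might look roughly like the sketch below. It assumes Unicode canonical decomposition (NFD) is what is meant, and `baseCharacter` is a hypothetical helper named for this example only.

```javascript
// Return the base character left after canonical decomposition (NFD),
// or null when the character does not decompose at all.
function baseCharacter(ch) {
  const decomposed = ch.normalize('NFD');
  if (decomposed === ch) return null; // nothing to decompose
  return Array.from(decomposed)[0];   // first code point is the base letter
}

console.log(baseCharacter('\u00e9')); // e-acute  -> 'e'
console.log(baseCharacter('\u00e7')); // c-cedilla -> 'c'
console.log(baseCharacter('\u00a9')); // copyright sign -> null
console.log(baseCharacter('\u00a2')); // cent sign -> null
```

Accented letters map back to their base letter, but look-alikes such as © and ¢ have no decomposition to c, which is where a hand-built list does better.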

New comment abhisek said on June 5, 2007:

hi! i downloaded the widget. but the Entities.html is not working. can you help me?

New comment Dan G. Switzer, II said on June 5, 2007:

I'd love to see the list provide full UTF-8 matching. There are a ton of useful entities in the UTF-8 charset. Plus then you could add in additional metadata for "shapes", "lines", "icon", etc.

New comment Dan G. Switzer, II said on June 5, 2007:

Oh yeah, this was a great idea!!!

New comment Remy Sharp said on June 5, 2007:

A few new features based on feedback:

  • Ability to set your own keywords or characters. Clicking on the 'add' image will allow you to edit the linked keywords. This is stored in your cookie, so if you want to reset you can just clear your cookie for this page.
  • Copy the entity to clipboard. IE, sadly, prompts each time you want to copy - but it works. Other browsers will use the Flash clipboard widget, and if Flash isn't installed, for Firefox only, it will fall back on another method (which includes instructions on how to enable it).
  • Compressed output option.

Any feedback you may have, please let me know.

New comment Michael Mahemoff said on June 6, 2007:

Remy, great idea.

I added some feedback on Ajaxian, but also I hope you can do some SEO so I will find this if I search for "html unicode" or "html character codes" etc.

New comment Paul Silver said on June 6, 2007:

Very handy tool (and widget), thanks for mentioning it on BNM

New comment abhisek said on June 6, 2007:

how to use the widget? please reply, i seem to be a dumbo. the entities.html is not working for me.

New comment Remy Sharp said on June 6, 2007:

@abhisek - What's the problem with the widget? You shouldn't be accessing the entities.html directly as this is a Mac OS X dashboard widget - it should install from the .wdgt file. I'll follow up offline.

New comment spen said on June 6, 2007:

Very smart tool. Would it benefit further by having hex codes available also?

New comment john cooper said on June 6, 2007:

damn that's so useful! :)

New comment JP said on June 7, 2007:

Very usable tool with excellent, clear results returned. Awesome.

New comment Calítoe.:. said on June 7, 2007:

Remy, thanks for taking into account our request.
Here's a list of the HTML numeric entity codes of the Unicode "Latin Extended-A" block. If you include those characters, I think all languages that use the Latin alphabet as a base will be covered.
http://www.pemberley.com/janeinfo/latin1.html#latexta

New comment Ray Cheung said on June 9, 2007:

It is a very handy tool. Thank you

New comment Matthew R said on June 10, 2007:

+1 for Yahoo/Konfabulator widget. Please.

New comment Remy Sharp said on June 12, 2007:

A few new features added:

  • Support for Latin Extended block via 'Incl. Extended' check box - which is remembered via sessions.
  • Total matches found shown
  • Small tweak to speed up search and results

New comment Dimitry said on June 12, 2007:

Awesome tool! Thanks for making this. Noted here:
http://www.youtilize.com/post/html-character-codes-made-easy

New comment KJ said on June 15, 2007:

Great work. One thing that frustrated me was the fact that I've been looking for entity codes for various Polish language special characters and this widget lists none of them :( lol. Does any one know of an alternative source from where I can find these?

New comment Remy Sharp said on June 15, 2007:

@KJ - if you send through a link with the Polish entities, I'll include it in the extended character set.

New comment Jermayn Parker said on June 18, 2007:

kewl :)

Always get caught out trying to find the character needed

New comment Ctrl-F5 said on June 25, 2007:

Hi, like everyone above I love this lookup you have created, it saves time and I've got rid of all of the printouts I used before.

One entity I have noticed you don't have is the curly brackets/braces ({}): &#123; + &#125;

New comment Calítoe.:. said on June 27, 2007:

KJ and Remy: With the Incl. extended feature (thank you ;)) the Polish characters are included. As I said when I posted the link to the Latin extended A block, I am pretty much sure that all European languages that use the Latin alphabet are included there.

New comment Steve Clay said on June 27, 2007:

With a little more JavaScript you could allow this to be used as a search engine in FF or Opera. On onload, just check document.location.search for "?q=something", unescape something and run a query on it.

New comment Steve Clay said on June 27, 2007:

Scratch that last comment. Apparently your CMS won't allow a querystring appended to the URL.

New comment Remy Sharp said on June 27, 2007:

@Steve - excellent idea. It's now implemented. Try this link for example

I've also added it as an OpenSearch plugin - so if you're using Firefox or IE7 - your browser should auto-detect the plugin to allow fast HTML character code lookups directly from your browser search box.
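
A query-string hook of that kind could be sketched as follows. This is an assumption-laden illustration, not the page's real code: `queryFromSearch` is a made-up name, and it uses the standard `URLSearchParams` API in place of the `unescape` call suggested above.

```javascript
// Extract the search term from a query string such as "?q=copy%20cent".
// Returns an empty string when no q parameter is present.
function queryFromSearch(search) {
  return new URLSearchParams(search).get('q') || '';
}

// In a browser this would run on load, for example:
//   const term = queryFromSearch(document.location.search);
//   if (term) runLookup(term); // runLookup is a hypothetical search function

console.log(queryFromSearch('?q=copy%20cent')); // "copy cent"
console.log(queryFromSearch('?q=%3C'));         // "<"
console.log(queryFromSearch(''));               // ""
```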

New comment Jacques said on June 28, 2007:

Your OpenSearch plug-in works nicely. I've added it to my Firefox search engines.

New comment Tobz said on June 29, 2007:

This tool is great. saves quite a bit of time.
It would be great if you made this into a search plugin for Firefox

New comment Tobz said on June 29, 2007:

Oh haha you did already - spoke too soon

New comment "hartigan" said on June 30, 2007:

+1 for yahoo widget

New comment ~~~d(o.o)b~~~ said on July 6, 2007:

Very cool tool!
Great - no more searching... ;-)

New comment Sascha said on July 6, 2007:

I really do like the widget!

New comment esports said on July 9, 2007:

Great. Bookmarked!

New comment Daniel said on July 10, 2007:

Amazing tool. I love it, it is very helpful and I wrote a review on my blog.

New comment serious said on July 11, 2007:

could you also provide us with an opera widget please? that would be really cool.

New comment Kurt said on July 17, 2007:

Great lil widget - makes me wish I actually wrote on my blog so I could talk about it.

Damn this thing is intuitive - I love it!

New comment Timothy Chambers said on July 23, 2007:

In the Dashboard widget I cannot seem to get to uarr - upwards arrow???? Not sure why.

New comment Roberto said on July 24, 2007:

The accent " ´ " works, it says acute, but the " ` " doesn't..

Anyway, great work!

New comment Yining said on July 26, 2007:

Hi, Remy,

Thank you for the great work! I would like you to know that I have made a Firefox extension based on your works.

The extension can be found at: http://www.yining.org/2007/07/26/html-entity-char-lookup-firefox-extension/

Once again, thanks!

New comment phetish said on August 25, 2007:

This is so great - it will come in really handy, I always meant to get around to writing something similar but like most plans they get procrastinated into next year.

Kudos!

New comment Dave said on September 24, 2007:

Very cool. How about the option to change fonts? E.g. drop down list of 4 or 5 common fonts to change the output.

Cheers

New comment Tobz said on October 3, 2007:

Would be nice to have as Adobe AIR app for a crossplatform "widget".
Maybe I'll have a go at making it.

New comment Micro Media said on November 24, 2007:

Wow... this tool is great :) I actually use it weekly.

New comment Clemens Lang said on November 27, 2007:

@Tobz: Yeah, why not. I made one. Still early alpha, so I'd like to hear some suggestions for upcoming versions.

Check it out at http://www.neverpanic.de/blog/single/htmlentites-adobe-air-widget/

New comment Greg Lowney said on November 30, 2007:

Very cool and useful! Thank you for making it available!

I hope you'll be able to accommodate user suggestions for additional mappings, such as S = $, ^ = up arrow, etc.

Another useful feature would be to display an image of the character, in addition to or in place of the actual entity, so that it would work when the font used on the client workstation doesn't support that character. For example, on one machine (Firefox on Windows XP) asking for symbols like < shows that ⟨ is rendered as a question mark.

New comment Greg Lowney said on November 30, 2007:

Ahhh, the last bit of my comment would be clearer if I'd escaped the ampersand. I meant to say that &⟨ (left-pointing angle bracket, U+2329) is rendered as a question mark. :-)

New comment Greg Lowney said on November 30, 2007:

&lang; !

New comment dak180 said on January 13, 2008:

I love your widget, however many times I am working with XML rather than HTML. XML takes only a very limited set of named entities; so while &nbsp; would not work, &#160; would. If you could add an option to copy only numeric codes instead of named ones, that would be excellent.

New comment Remy Sharp said on January 13, 2008:

@dak180 - &#160; is the equivalent to &#xA0; - would this work in XML too? Only because I've already got a version working with hex characters which I could put live.

(Don't worry, I can edit comments :-) )
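
To make the decimal and hex equivalence concrete, here is a small sketch that derives both numeric forms from a character. The `numericReferences` helper is hypothetical and is not part of the widget.

```javascript
// Build the decimal and hexadecimal numeric character references
// for the first code point of the given string.
function numericReferences(ch) {
  const codePoint = ch.codePointAt(0);
  return {
    decimal: `&#${codePoint};`,
    hex: `&#x${codePoint.toString(16).toUpperCase()};`,
  };
}

console.log(numericReferences('\u00a0')); // { decimal: '&#160;', hex: '&#xA0;' }
console.log(numericReferences('\u00a9')); // { decimal: '&#169;', hex: '&#xA9;' }
```

Both forms are numeric character references that XML itself defines, so either should be usable where named entities such as &nbsp; are not declared.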

New comment Sylvester said on February 13, 2008:

Great tool, we use it on a weekly basis! Thanks from Holland

New comment zippy said on April 2, 2008:

Cool stuff! What about a table of characters, to find something whose name is unknown?
