Linux Server Survives Total RAID Failure

It looks like a perfectly nice day, until I hear the screaming RAID failure alarm blaring out of the server room as I wander in late. My pace picks up considerably…

“Hey, who’s screaming in the server room?” I shout as I get to our office area.

“Ah, that’s Strawberry – Joan thinks its RAID controller died, so she’s building a replacement from backups now.”

My brain begins to race. Joan thinks the RAID controller is dead? It’s dead. I’ve worked with Joan for over a year and I don’t bother checking her diagnosis anymore – I can safely assume she’s right. She’s rebuilding it from backups? OK, good – I can’t make that go faster than the tape drive, so I’ll leave her to it. How can I back her up? Log in, see if I can get in at all.

Cos here’s the thing, folks. A Linux server can completely lose all access to its storage – and keep going. Today though, even I am in for some surprises.

I log in. Cool – it’s still running. The kernel’s obviously still going and the network’s fine. Wait a minute…

“HEY, if the network’s fine on this thing, what’s it NOT doing? Aren’t we just using it as a router?”

“Yeah, but it’s doing DHCP and that’s crashed.”

Crap. Ah well, what can I get?

I try an ls on /etc/… no good. The system can’t load the binary ‘ls’. No directory listings for me. OK, well, can I get to the DHCP server configuration file?

#vi /etc/dhcpd.conf

No good – the system times out trying to load the ‘vi’ editor from the disk. OK… what can I do?

#cat /etc/dhcpd.conf

Bingo! Cat doesn’t need to be loaded from the disk. Why? It’s called during nightly automatic jobs on this machine, so it’s still cached in memory from the last time it was loaded! Better still, Linux is smart enough not to even _try_ to get to the dead disks if the program it’s trying to load is already in memory!

I copy the dhcpd.conf file into an email, along with the network and routing config and the WAN link configuration, and send it to Joan. That ought to speed things up a bit for her!

Hmm, what else can we do?

Can I restart the dhcp server?

#service dhcpd start

Cannot open leases file /var/lib/dhcp/dhcpd.leases

Well, that makes sense. The DHCP server remembers which IP addresses it has given out to whom in that file. It can’t access the file though, because it’s on the dead disk, so it won’t load 🙁

What to do?

I take a break and have a coffee. There has to be a way around this.

Suddenly, I get a flash of inspiration!

If I could get DHCP to put its leases file in /dev/shm (which exists in RAM, not on the disk!) it could run, and we could have this server doing everything it used to until Joan’s replacement server is ready!
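If you want to convince yourself that a directory really lives in RAM, here is a minimal sketch, assuming GNU coreutils `stat` on a healthy Linux box (on a machine with dead disks you may not be able to load `stat` at all). The helper names `fs_type` and `is_ram_backed` are made up for illustration.

```shell
#!/bin/sh
# Print the filesystem type backing a path.
# Assumption: GNU coreutils stat, where -f reports on the filesystem
# and %T is the filesystem type in human-readable form.
fs_type() {
    stat -f -c %T "$1"
}

# Succeed if the path sits on a RAM-backed filesystem (tmpfs or ramfs).
is_ram_backed() {
    case "$(fs_type "$1")" in
        tmpfs|ramfs) return 0 ;;
        *) return 1 ;;
    esac
}

if is_ram_backed /dev/shm; then
    echo "/dev/shm is RAM-backed"
else
    echo "/dev/shm is NOT RAM-backed"
fi
```

Anything written under a tmpfs mount disappears on reboot, which is fine for a stopgap like a fresh leases file.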

But how? I can’t edit the config file. Even if I could load an editor (put one on a floppy disk?!) I couldn’t EDIT the file – it’s on the dead disk and can’t be written to!

Mount! I could mount it!

#mount -t tmpfs /dev/shm /var/lib/dhcp/

Cannot access /etc/mtab

Damn, this game is just not fun. I check the man page for mount – I can mount without changing the /etc/mtab file…

#mount -n -t tmpfs /dev/shm /var/lib/dhcp/

No output comes back… it worked!

#touch /var/lib/dhcp/dhcpd.leases

#service dhcpd start

cannot open /var/run/dhcpd.pid

Bah!

#mount -n -t tmpfs /dev/shm /var/run/

#service dhcpd start
Starting dhcpd: [ OK ]

YES!!!

“Hey, uh, I think I got dhcpd going – can we check with the VIPs and see if they can log in now?”

Yes, they could. Later that day, we replaced the failed server with a brand new one. Instead of hours of outage and a rushed replacement, we had a very short outage and a well set up new server.

That’s one thing I love about Linux. It’s so robust; it generally survives everything it’s theoretically possible to survive, and it gives you the flexibility you need to do things like… mount random directories in RAM instead of on the dead disk they USED to live on. You don’t get your data back, but you can create a new file.
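For the record, the whole trick fits in a few lines. What follows is a sketch of the steps from the story, not a tested recovery tool: the function names and the DRY_RUN switch are my own invention for illustration, and by default it only prints what it would do. The paths and the `service dhcpd start` call are the ones from this particular server and will differ on yours.

```shell
#!/bin/sh
# Sketch of the recovery steps described above.
# Assumption: DRY_RUN is a made-up switch; with the default of 1 the
# commands are only printed. Set DRY_RUN=0 (as root) to really run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Mount a RAM-backed tmpfs over a directory that lives on the dead disk.
# -n skips the /etc/mtab update, which fails when the disk is unwritable.
# For tmpfs the device argument (/dev/shm here) is only a label.
ram_overlay() {
    run mount -n -t tmpfs /dev/shm "$1"
}

recover_dhcpd() {
    ram_overlay /var/lib/dhcp              # fresh home for the leases file
    run touch /var/lib/dhcp/dhcpd.leases   # dhcpd wants the file to exist
    ram_overlay /var/run                   # somewhere to write dhcpd.pid
    run service dhcpd start
}

recover_dhcpd
```

Bear in mind that mounting over /var/run hides whatever was in that directory before, so this is strictly a keep-the-lights-on measure until the replacement is ready.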

The above is a true account. It happened to me on August 16, 2004. I know, because I kept a copy of the incident report. Because I am a Nerd.
