03
Feb
12

XenServer Idea Dump

1 Don’t go over 95% disk usage on a XenServer Storage Repository (SR).

A running system will stop responding and you will not be able to start/stop servers correctly.

Make sure you MOVE the VMs to another SR if storage space is running short.
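
A minimal sketch of the 95% check. It assumes you have already read the SR's used and total sizes in bytes (for example from the SR's physical-utilisation and physical-size fields); the function names are made up for illustration and are not part of any XenServer tool.

```python
# Hypothetical helpers: the names and the 95% default follow the note above,
# they are not part of XenServer itself.

def sr_usage_percent(used_bytes, total_bytes):
    """Return SR utilisation as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def sr_needs_attention(used_bytes, total_bytes, limit=95.0):
    """True when the SR is at or above the limit and VMs should be moved."""
    return sr_usage_percent(used_bytes, total_bytes) >= limit


if __name__ == "__main__":
    gib = 1024 ** 3
    # Example figures only: 1900 GiB used on a 2000 GiB SR.
    print(sr_usage_percent(1900 * gib, 2000 * gib))
    print(sr_needs_attention(1900 * gib, 2000 * gib))
```

In practice you would want to alert well before the limit, so that there is still room to move VMs off the SR.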

2 How to find an orphaned VDI

xe vdi-list sr-uuid=<sr id> params=uuid
This command displays the list of registered virtual disks.

e.g. xe vdi-list sr-uuid=a6811919-143b-e8f8-7767-17bd1af1e968 params=uuid

Then run ls -alh on the SR mount point:

[root@/var/run/sr-mount/a6811919-143b-e8f8-7767-17bd1af1e968]ls -alh

If there are MORE files than vdi-list shows, the extra file may be an orphaned file (don’t delete it yet).
Run xe vbd-list vdi-uuid=<disk uuid> params=all and check whether the disk has any association with a VM.
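
The comparison above can be sketched as a set difference. This assumes the xe output has lines of the form "uuid ( RO) : <uuid>" and that each disk on the mount point is named <uuid>.vhd; both function names are made up for illustration.

```python
import os


def registered_uuids(xe_output):
    """Extract UUIDs from 'xe vdi-list ... params=uuid' style output.

    Assumes each record line looks like 'uuid ( RO)    : <uuid>'.
    """
    uuids = set()
    for line in xe_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().startswith("uuid"):
            uuids.add(value.strip())
    return uuids


def orphan_candidates(filenames, known_uuids):
    """Return the .vhd files whose UUID is not in the registered set."""
    candidates = []
    for name in sorted(filenames):
        stem, ext = os.path.splitext(name)
        if ext == ".vhd" and stem not in known_uuids:
            candidates.append(name)
    return candidates


if __name__ == "__main__":
    # Shortened, made-up UUIDs for illustration.
    xe_output = "uuid ( RO)    : aaaa\n\n\nuuid ( RO)    : bbbb\n"
    files = ["aaaa.vhd", "bbbb.vhd", "cccc.vhd", "filelog.txt"]
    print(orphan_candidates(files, registered_uuids(xe_output)))
```

Anything this returns is only a candidate. It still needs the vbd-list check before you consider deleting it.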

If the VDI shows name-label ( RW): base copy,

that disk is the original (parent) of the linked snapshot disks.

See if you can find vhd-parent in the sm-config field.

See the example below.

uuid ( RO)                    : c7bcc486-86c1-48d6-9645-8d897df72d19

name-label ( RW): Citrix Profiler 5.2 C Drive

sm-config (MRO): vhd-parent: 82a7fbe1-ffb1-445d-a80d-45e355511e2d

uuid ( RO)                    : 82a7fbe1-ffb1-445d-a80d-45e355511e2d

name-label ( RW): base copy

================ example of snapshot VHDs ================

-rw-r--r--  1   96   96 6.6G Apr 12 14:26 82a7fbe1-ffb1-445d-a80d-45e355511e2d.vhd - base image

-rw-r--r--  1   96   96  35K Apr 12 14:26 1803ec76-d3b3-4a24-a805-b7422937ea9b.vhd - DIFF DISK

-rw-r--r--  1   96   96  35K Apr 12 14:26 c7bcc486-86c1-48d6-9645-8d897df72d19.vhd - Newly Created snapshot active image

==============================
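
A rough sketch of following vhd-parent links from a disk up to its base copy. It assumes you have already collected the vhd-parent values from sm-config into a dict of child UUID to parent UUID; the function name is made up, and the only link used in the example is the one shown in the listing above.

```python
def parent_chain(uuid, parents):
    """Follow vhd-parent links from a VDI up to the disk with no parent."""
    chain = [uuid]
    seen = {uuid}
    while chain[-1] in parents:
        parent = parents[chain[-1]]
        if parent in seen:
            # Should not happen on a healthy SR; stop rather than loop forever.
            raise ValueError("cycle in vhd-parent links at %s" % parent)
        chain.append(parent)
        seen.add(parent)
    return chain


if __name__ == "__main__":
    # child UUID -> vhd-parent UUID, taken from the example above.
    parents = {
        "c7bcc486-86c1-48d6-9645-8d897df72d19": "82a7fbe1-ffb1-445d-a80d-45e355511e2d",
    }
    for link in parent_chain("c7bcc486-86c1-48d6-9645-8d897df72d19", parents):
        print(link)
```

A disk that appears as a parent in any chain is still in use, even if no VM is attached to it directly.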

3 Disable Large Receive Offload on 10G NIC Bonding

http://support.citrix.com/article/CTX127400

Disable the LRO (Large Receive Offload) feature for all 10 Gigabit Ethernet NIC member interfaces of a bond.

Identify the 10 Gigabit NICs:
Open XenCenter and connect to the XenServer. Click on the XenServer and navigate to the NICs tab of the XenServer. Identify the 10 Gigabit NICs either by their speed of “10000 Mbit/s” or their Device name, such as “82599EB 10-Gigabit Network Connection”.
The Linux device name for NIC0 will be eth0, NIC1 will be eth1, and so on.

Edit the /etc/rc.local file of the XenServer and add the following text to the end of the file for each of the above identified 10 Gigabit NICs:
ethtool -K <interface> lro off

Example:
If the bond consists of eth2 and eth3, add the following two lines:

ethtool -K eth2 lro off
ethtool -K eth3 lro off

Repeat the /etc/rc.local edit above on all XenServers with 10 Gigabit NICs, and reboot each server after modifying the file.
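
If you have many hosts, the rc.local lines can be generated rather than typed. This is only a sketch: it works on the file contents as a string and leaves reading and writing /etc/rc.local to you, and both function names are made up.

```python
def lro_off_lines(interfaces):
    """Build one 'ethtool -K <interface> lro off' line per bond member."""
    return ["ethtool -K %s lro off" % nic for nic in interfaces]


def append_missing(rc_local_text, lines):
    """Return the rc.local text with any missing lines appended at the end."""
    existing = {line.strip() for line in rc_local_text.splitlines()}
    missing = [line for line in lines if line not in existing]
    if not missing:
        return rc_local_text
    if rc_local_text and not rc_local_text.endswith("\n"):
        rc_local_text += "\n"
    return rc_local_text + "\n".join(missing) + "\n"


if __name__ == "__main__":
    # Example rc.local contents; the bond members are eth2 and eth3 as above.
    current = "#!/bin/sh\ntouch /var/lock/subsys/local\n"
    print(append_missing(current, lro_off_lines(["eth2", "eth3"])), end="")
```

Running it twice against the same contents adds nothing the second time, so it is safe to repeat on a host that has already been fixed.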

4 Enable PortFast on XenServer connected ports.

1. PortFast allows a switch port running Spanning Tree Protocol (STP) to go directly from blocking to forwarding mode, skipping the listening and learning states.

PortFast should only be enabled on ports connected to a single host.

The port cannot be a trunk port; it must be in access mode.

Ports used for storage should have PortFast enabled.

Note: It is important that you enable PortFast with caution, and only on ports that do not connect to multi-homed devices such as hubs or switches.

2. Disable Port Security on XenServer connected ports.

Port security prevents multiple MACs from being presented to the same port. In a virtual environment, multiple MACs from the VMs are presented to the same port, which causes the port to shut down if Port Security is enabled.

3. Disable Spanning Tree Protocol on XenServer connected ports.

Spanning Tree Protocol should be disabled if you are using bonded or teamed NICs in a virtual environment. Because of the way bonds and NIC teaming work, leaving it enabled causes failover delays.

4. Disable BPDU guard on XenServer connected ports.

BPDU guard is a protection setting, part of STP, that prevents you from attaching a network device to a switch port. When you attach a network device, the port shuts down and has to be re-enabled by an administrator.

A PortFast port should never receive configuration BPDUs.

http://support.citrix.com/article/CTX123158

5 Considerations for IP Addressing in XenServer for Storage and Management Networks

http://support.citrix.com/article/CTX118011

6 XenServer 5.6 FP1 – may experience FREEZE (no hotfix yet, Apr 2011)

http://forums.citrix.com/thread.jspa?threadID=280290&start=0&tstart=30

Workaround:
1. Log in to the control domain (dom0) on the affected box and run the command below:
echo "NR_DOMAIN0_VCPUS=1" > /etc/sysconfig/unplug-vcpus
2. Reboot server

http://support.citrix.com/article/CTX127395
Hosts Become Unresponsive with XenServer 5.6 on Nehalem and Westmere CPUs

6 XS snapshot notes

http://support.citrix.com/article/CTX122978

XenServer: Understanding Snapshots

Deleting a snapshot will not recover the disk space if the storage is connected via iSCSI/FC.

NFS has a smaller footprint, but the main disk plus a delta disk still remain even when there are no more snapshots.

You can export and import the VM to collapse the disks into one and reclaim the space.

 

 

7 How to create a virtual router/isolated network on XenServer

http://support.citrix.com/article/CTX116456

 

 

8 vCPU tuning

The following section describes how to modify the default setup, with some example commands.

The virtual CPU (vCPU) behaviour can be modified by altering the VCPUs-params parameter of a virtual machine.

vCPU pinning is the term for mapping the vCPUs of a VM to specific physical resources.

You can tune a vCPU’s pinning with the following command:

[root@xenserver ~]# xe vm-param-set uuid=<VM UUID> VCPUs-params:mask=1,3,7

The VM from the above example will then run on physical CPUs 1, 3, and 7 only.

The VCPU priority weight parameters can also be modified to grant a specific VM more CPU time than others.

[root@xenserver ~]# xe vm-param-set uuid=<VM UUID> VCPUs-params:weight=512

The VM from the above example with a weight of 512 will get twice as much CPU as a domain with a weight of 256 on a busy XenServer Host where all CPU resources are in use.

Valid weights range from 1 to 65535 and the default is 256.
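
To see what a set of weights means on a fully busy host, you can work out the proportional shares. This is a simplified sketch: it assumes every VM is competing for CPU at the same time and ignores caps, pinning and vCPU counts. The function name and VM names are made up.

```python
def cpu_shares(weights):
    """Map each VM name to its fraction of CPU time under full contention.

    weights: dict of VM name -> VCPUs-params:weight value (1 to 65535).
    """
    for name, weight in weights.items():
        if not 1 <= weight <= 65535:
            raise ValueError("weight for %s out of range: %r" % (name, weight))
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Two VMs competing on a busy host, one at 512 and one at the default 256.
    for name, share in cpu_shares({"vm-a": 512, "vm-b": 256}).items():
        print("%s: %.1f%% of contended CPU" % (name, share * 100))
```

With those two weights the 512 VM ends up with two thirds of the contended CPU and the 256 VM with one third, which is the "twice as much" described above.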

The CPU cap optionally fixes the maximum amount of CPU a VM can use.

[root@xenserver ~]# xe vm-param-set uuid=<VM UUID> VCPUs-params:cap=80

 

 

9 Renaming a volume or its description on Dell EqualLogic results in connection failure.

http://support.citrix.com/article/CTX123284

 

In summary: do not rename volumes and do not change descriptions. Both values must match XenServer’s data for the SR to operate correctly.

 

10 VM not starting up with error: “Starting VM ‘Name-of-VM’ – This operation cannot be performed because the specified VDI could not be found on the storage substrate”.

http://support.citrix.com/article/CTX118383

A mapped disk image, including a DVD/CD, may be missing (the SR is down or the DVD image is missing). Unmount the disk, or find the disconnected SR and repair it.

 

11 XS 5.6 can be configured to support different STEPPINGS of CPU for XenMotion/Live Migration.

http://support.citrix.com/article/CTX127059

You can simply join the pool using XenCenter; it will apply CPU masking if it is possible to do so. Otherwise manual configuration may be required.

12 Can’t delete NIC/bond? – “Cannot connect to server” error

XenCenter WILL run the matching command on ALL the XenServers.

If any server has a problem, your command WILL fail and leave the servers in an inconsistent state.

If any member of the pool is down, do not perform maintenance tasks in XenCenter. You will cause more damage if you do. REMOVE the dead server first if you have to.

13 Pool master is DOWN!

If you lose the pool master, perform pool master recovery before running any OTHER command; running any other xe command may cause more harm. (Remember, VMs are not affected by the pool master being down.)

  1. Select any running XenServer within the pool that will be promoted. (Each
    member server has a copy of the management database and can take control of
    the pool without issue.)
  2. From the server’s command line, issue the following command:
    xe pool-emergency-transition-to-master
  3. Once the command has completed, recover connections to the other member
    servers using the following command:
    xe pool-recover-slaves
  4. Verify that pool management has been restored by issuing a test command at
    the CLI (xe host-list)
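
The recovery steps above can be sketched as a small script that runs the three commands in order and stops at the first failure. This is an illustration only: the wrapper and its names are made up, it defaults to a dry run, and it should be run on the member server you have chosen to promote.

```python
import subprocess

# The three xe commands from the steps above, in order.
RECOVERY_STEPS = [
    ["xe", "pool-emergency-transition-to-master"],
    ["xe", "pool-recover-slaves"],
    ["xe", "host-list"],
]


def run_recovery(runner=None, dry_run=True):
    """Run the recovery steps in order, stopping at the first failure.

    runner takes an argv list and returns an exit code; when it is None,
    subprocess.call is used. With dry_run=True nothing is executed.
    """
    completed = []
    for argv in RECOVERY_STEPS:
        if dry_run:
            print("would run: " + " ".join(argv))
        else:
            code = runner(argv) if runner else subprocess.call(argv)
            if code != 0:
                raise RuntimeError("%s failed with exit code %d" % (" ".join(argv), code))
        completed.append(argv)
    return completed


if __name__ == "__main__":
    run_recovery()  # dry run: prints the commands without executing them
```

Stopping at the first failure matters here, since running further xe commands against a pool with no master is exactly what the note above warns against.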

14 I want to install a new driver

Assuming the RPM is driver.rpm:
1 COPY/BACK UP the modprobe files, EVERY file under the directory below (you will have to restore them if you wish to REMOVE the driver).
On 5.6 SP2:
/lib/modules/2.6.32.12-0.7.1.xs5.6.100.323.170596xen/*
2 Then use rpm to install:
>rpm -ivh driver.rpm

15 I want to remove a newly installed driver.

>rpm -e <package name> will remove the binary files (note that rpm -e takes the installed package name, not the .rpm file name). However, it does not clean up the modules.dep file, and as a result you will find that the OLD driver will not start up.
Open
/lib/modules/2.6.32.12-0.7.1.xs5.6.100.323.170596xen/modules.dep
find the old line (e.g. bnx2x) and make sure the path is set correctly.
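
A small sketch for locating the driver's line in modules.dep before editing it by hand. It assumes the usual layout of one module per line, "path/to/module.ko: dependencies"; the function name and the sample paths are made up.

```python
def find_module_lines(modules_dep_text, module):
    """Return (line_number, line) pairs whose target file is <module>.ko."""
    hits = []
    for number, line in enumerate(modules_dep_text.splitlines(), start=1):
        target = line.split(":", 1)[0].strip()
        filename = target.rsplit("/", 1)[-1]
        if filename == module + ".ko":
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # Made-up sample contents; on a real host, read the modules.dep file above.
    sample = (
        "/lib/modules/X/kernel/drivers/net/bnx2.ko:\n"
        "/lib/modules/X/updates/bnx2x.ko: /lib/modules/X/kernel/lib/libcrc32c.ko\n"
    )
    for number, line in find_module_lines(sample, "bnx2x"):
        print(number, line)
```

Matching on the file name rather than a plain substring keeps bnx2 and bnx2x apart, which is easy to get wrong with a quick text search.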

16 Iperf test is really slow, why?

The R610’s Broadcom 57711 10G card only manages about 2.3 Gb/sec. We opened a support ticket, but all we found out is that increasing the number of threads significantly gives higher throughput (e.g. 40 threads manage about 6 Gb/sec).
28
Jan
12

Everyone is an architect

Sadly or happily, my job title now includes the word “architect”.

It’s great news for some, but honestly, I am puzzled whether I deserve such a title.

I know how a real architect (building/design) gets there: they study for 3-4 yrs to be one and then work for years as one. Their goal is to be an architect and they train to be one. In IT, there is no such goal or training. I started as an IT analyst, moved on to be a system engineer, and learned to design/implement along the way. No formal training, no formal qualification.

Should an IT engineer (even the term ENGINEER would anger a few real ENGINEERS today) call themselves an IT architect after 5-10 yrs of professional life?

Well, who am I kidding, I just want more pay, don’t we all, and title = higher grade = more pay =(

 

28
Jan
12

rain rain another rain

I’ve found out that heavy rain has caused another water leak under my roof. =(

The heavy rain in Jan 2011 already caused grief (I needed to call the insurance company to get the water damage repaired), and now this.

I’ve set my excess to $1000, and I don’t think this damage will be that much. I guess I’ll find out, eh?

 

 

 

18
Jun
10

you don’t know your own software?

It’s a strange world indeed. My company uses software that’s written in VB6 and runs within a Citrix environment.

This software vendor decided to call me up asking for help integrating their software into ANOTHER customer’s Citrix environment.
Well, as a friendly Citrix dude, I did give him a few pointers, but isn’t it rather strange for the vendor to call a customer to seek help with another customer’s issue?

It’s YOUR own software, ppl. If you don’t think it runs on Citrix or don’t know how to troubleshoot software integration, LEARN how to do it, seriously.

I really get frustrated when a developer calls an infrastructure engineer (usually me) saying, “I don’t know anything about OS, Network, blah blah, can you fix it?”
I have spent enough time learning C#, app debugging and networking (switching/routing) so I can help myself when integrating/debugging the issues I face.

Why is it always someone else’s problem when the problem is escalated to the developer????

This is like a trend: devs who don’t know SMTP, IIS, DBMS, firewalls, TCP/IP.

Or it must be just me working with horrible developers..

10
Jun
10

Solutions for Virtualizing Internet Explorer

I received a newsletter email saying that MS has released a white paper on virtualizing IE.

Wow… one from MS?

My assumption was that ‘it has to be about how to use App-V’, their flagship application virtualization suite.

Who am I kidding here, the white paper (here) is about using VIRTUAL PC and Terminal Server.

Did they really have to write up a fancy WHITE PAPER for this ?

07
Jun
10

Wow.. end of LZH

The author of LZH got sick of getting no attention from the major anti-virus vendors and has announced the end of its development.

I understand it is critical these days to have a filter at the border level, and the vendors have been neglecting support for LZH/ARJ. (Surprisingly, 7z is now widely supported.)

Well, I also moved away from these and have been using 7zip, but it’s sad to see a format go.

07
Jun
10

Python and Ruby on Windows

I find it pretty DAMN annoying to set up some dev environments on Windows. (BTW, I run Win7 x64)
There are numerous binary gems and Python eggs that simply fail to run on x64.

I was going to develop an LDAP read/report tool as a hobby project, and it’s damn painful to set these up on Windows.

Yes, I can run *nix and have Fedora 13 set up, but if I am targeting Active Directory, I thought developing on Windows was the natural choice?

Well, activeldap and openldap all seem to be abandoned or not well supported on Windows.
Who am I kidding, I should be running C# on Windows, not Ruby or Python. =(

06
Jun
10

Quit?!

The PM of Japan has resigned again, leaving numerous problems behind, due to pressure from his own political party.

Not sure if it was a good idea or a bad one, but I see absolutely no need for him to quit. Resignation was not a solution to the problems he, or the people of Okinawa, were facing.

He’s gotta be kidding if he thinks he can just quit and all the problems will go away.

04
Jul
09

You can’t even answer that?

Not that I’m proud or anything, but when I interview people to make a hiring decision, I give the candidate general “fundamental” questions.

At least, I try not to give YES/NO questions or TEXT BOOK questions like “what is the term XYZ used for?”

By the time a candidate reaches the technical interview with me, they have potential… or at least I hope they do. We have faith in them and we really do hope they pass.

However… the reality is most of them don’t even pass the first written questions (5 short q).

For example, Active Directory. (we’re a wintel house)

People have worked with Active Directory for yrs by now, yet they struggle with simple questions like:

What do you do if a security (group membership) change does not get applied for a user at a remote office?
What do you do if a GPO is not getting applied to a certain OU?

I had an answer that just said “wait for replication”… (yes I know.. but do you just wait? how would you force replication? what if the user never logged out and never obtained a new token from AD??)
For the next Q, we just want things like: check the scope of the GPO, check whether it’s linked to the correct OU, use GPMC/RSoP to find a conflict if there is any. Many can’t even get these points.

BTW, these candidates had like 10 yrs exp and were going for a “senior” engineer role.

YOU CAN’T EVEN ANSWER THAT? But you want a senior role?

25
Jun
09

Got Tivo?

Just got a Tivo this week and started to configure it… I thought it would be easy and simple.

(obviously not)

1 ICMP required
Took me an hr to realize why the initial setup screen kept telling me to check the internet connection.
I can ping the box (static IP), DNS is up and the router has no outbound reject/drop filter other than SMTP.
Turns out the box tries ICMP/PING and if that fails, it thinks you’ve got no Internet. (HELLO? MANUAL? where does it say you need ICMP?!?!?!?)

2 initial patching and registration
The box applies updates by itself and reboots at night (fair enough), but the initial patch requires several downloads and reboot after reboot? (HELLO? MANUAL?)

3 lack of technical details in the manual or on the web.
Come on, how difficult is it to have a FAQ/SUPPORT page explaining what TECH details are needed? (like ICMP out, what port OUT, what port IN)

I had to create a FAILURE condition (block all connections out) to bring up the screen that contains the required port #s.
(HELLO? MANUAL?)

I understand the BOX is not for geek heads but more for ordinary dads/mums, but there have to be better written docs.

Also, the port # list just says you need to have these xyz ports open. OK, WHICH WAY? Inbound or outbound?

Do they know a FW is configured in/out?????

Lastly, why am I seeing a port test failure when it is not even HITTING the router? (the ACL logs every activity of the Tivo, and when I hit the port test page I see no traffic)

otherwise, recording is easy and fun. all good.



