2009.5.21 updated by R. Saito

Current status: finished. We can now use the machine as a Unix machine.

Current problem: it seems that /nfsboot/root should be made for each PXE client.
http://pre-dawn.net/hiki/?DisklessCluster
http://vision.kuee.kyoto-u.ac.jp/~nob/doc/diskless/diskless.html

Currently known problems:
-Use a RAM disk for each PXE client and the server.
-Set up a common area in /nfsboot and a client-specific area.
-Allow /nfsboot/root to be updated from only one client.

A new page for the Linux cluster (newtube) is now open.
http://flex.phys.tohoku.ac.jp/english/pukiwiki-e/index.php?PXE%20Server%20for%20New%20Tube%20(Open)

http://www15.big.or.jp/~yamamori/sun/pxe/nic.html
http://docs.fedoraproject.org/install-guide/f10/

We want to build a diskless computing system, so we need a PXE server. PXE stands for "Pre-boot eXecution Environment". PXE is a special extension of the services provided by the Dynamic Host Configuration Protocol (DHCP). It uses a Trivial File Transfer Protocol (TFTP) server to provide a minimal boot image to a network client. Let's try to configure it!
----
Contents
#contents
* Set a small subnet for testing PXE [#f0142583]
**What we need [#pe839496]
- A computer running Linux, to be configured as the server. Here we use Fedora Linux. This computer should have at least two network interface cards (NICs). One of the cards connects the server to the client.
- A computer that acts as the client. For checking purposes this computer needs an SSH client (either Windows or Linux is fine).
- A network hub and cables.
**Checking [#r1faf503]
***Network setting [#j7b294ed]
- Turn on the hub and connect a network cable from the Fedora Linux computer (PXE server) to the hub (e.g. to port 1).
- Connect another network cable from the client computer (Windows/Linux) to the hub (e.g. to port 2).
- Open the network configuration on Fedora Linux with root privileges. We set up the network interface card that is connected to the hub.
- Assuming the network card is eth0, we set it to have:
-- IP Address: 192.168.1.10
-- Subnet Mask: 255.255.255.0
- The PXE server also has an address on the local network, on eth1:
-- IP Address: 172.17.7.57
-- Subnet Mask: 255.255.252.0
-- Gateway: 172.17.4.1
-- Nameserver: 172.17.4.2
- On the client computer, set a static IP address:
-- IP Address: 192.168.1.30
-- Subnet Mask: 255.255.255.0
-- Gateway: 192.168.1.10 (specified in dhcpd.conf; IP forwarding is needed on the server)
- The OS installed on the PXE server is also the one used by the diskless PXE client.
***SSH check [#tad4148a]
- Open an SSH program on the client computer, for example PuTTY or any command-line client:
 ssh username@192.168.1.10
- If the username is "nugraha" and the hostname of the server is "rsaito-necPC", we get the following prompt after a successful login:
 nugraha@rsaito-necPC:~$
This means that we can access the server from the client.
* Setting for the original Linux machine from which we copy the system [#e9d2e315]
- We use a Fedora 10 machine (the PXE server itself) as the source of the system.
- We need to install busybox-anaconda on that machine before copying the system.
 172.17.4.178:# yum install busybox-anaconda
* Setting for the PXE server machine [#s4cd2a84]
** SELinux should be in permissive mode. [#e2317643]
** Install DHCP and TFTP Server etc [#s9d399d8]
- Install dhcp, syslinux, tftp-server, nfs-utils and system-config-netboot (as root):
 # yum install dhcp
 # yum install syslinux
 # yum install tftp-server
 # yum install nfs-utils
 # yum install system-config-netboot
- Check that the software is installed:
 # rpm -qa | grep syslinux
 syslinux-2.2.2.2.2
-- If you can see the name and version, it is OK. If not, run yum install again.
** TFTP Configuration [#nfd5db55]
- Edit /etc/xinetd.d/tftp on the PXE server machine:
 # default: off
 # description: The tftp server serves files using the trivial file transfer \
 #       protocol.  The tftp protocol is often used to boot diskless \
 #       workstations, download configuration files to network-aware printers, \
 #       and to start the installation process for some operating systems.
 service tftp
 {
         disable         = no
         socket_type     = dgram
         protocol        = udp
         wait            = yes
         user            = root
         server          = /usr/sbin/in.tftpd
         server_args     = -s /tftpboot
         per_source      = 11
         cps             = 100 2
         flags           = IPv4
 }
- Two settings should be changed: (1) disable = no, and (2) server_args = -s /tftpboot.
- Possible troubles: (1) tftp does not work, (2) tftp does not find the files.
- tftpd runs under xinetd, so we restart xinetd:
 # service xinetd restart
** Copy system data for booting [#v32daf81]
- We make the file system under /nfsboot:
 # mkdir /nfsboot
- rsync is used for copying the files:
 # rsync -v -a --exclude='/proc/*' --exclude='/sys/*' --exclude='/dev/*' \
   --exclude='/media/*' --exclude='/tmp/*' --exclude='/misc/*' \
   / /nfsboot
- About 10 GB of files are copied, which takes about 10 minutes. This file system will be exported by NFS.
- When the original machine (172.17.4.178) is updated, we should run rsync again.
- If we get the file system over the network, the following command can be used:
 # rsync -v -a -e ssh \
   --exclude='/proc/*' --exclude='/sys/*' --exclude='/dev/*' \
   --exclude='/media/*' --exclude='/tmp/*' --exclude='/misc/*' \
   172.17.4.178:/ /nfsboot
** DHCP Configuration [#b4fdd07c]
- Edit /etc/dhcpd.conf.
The following is a configuration for a network that uses:
-- 192.168.1.0/255.255.255.0 addressing
-- Dynamic addresses provided between 192.168.1.100 and 192.168.1.240
-- DHCP/TFTP server (next-server) at IP address 192.168.1.10
 allow booting;
 allow bootp;
 use-host-decl-names on;
 ddns-update-style interim;
 ignore client-updates;
 subnet 192.168.1.0 netmask 255.255.255.0 {
   option subnet-mask 255.255.255.0;
   option broadcast-address 192.168.1.255;
   range dynamic-bootp 192.168.1.100 192.168.1.240;
   next-server 192.168.1.10;
 }
 host dell {                                # hostname
   hardware ethernet 00:21:70:c9:eb:60;     # MAC address of NIC
   fixed-address 192.168.1.30;              # corresponding IP address
   filename "/linux-install/pxelinux.0";
 }
 class "pxeclients" {
   match if substring(option vendor-class-identifier, 0, 9) = "PXEClient";
   next-server 192.168.1.10;
   filename "/linux-install/pxelinux.0";
 }
http://d.hatena.ne.jp/adsaria/20080209
-- Turn on dhcpd
 # /sbin/service dhcpd restart
 # chkconfig dhcpd on
(the last line starts dhcpd automatically at boot)
-- Check that the DHCP server works.
--- Start a client computer that is connected to the server.
--- Set its TCP/IP configuration to obtain the IP address dynamically.
--- Connect by ssh:
 % putty username@192.168.1.10
--- If we can connect, DHCP has been configured successfully.
 % ipconfig /all
(on Windows; on Linux use ifconfig -a)
--- Note the MAC address of the client PC. In this case it is 00-21-70-c9-eb-60 (Dell PC).
--- The MAC address is used as a file name in the pxelinux.cfg directory.
--- Note that PXE will look for 01-00-21-70-c9-eb-60
--- ("01-" must be added at the top).
** NFS configuration [#r08426e2]
- The NFS host is the PXE server (192.168.1.10).
- Edit /etc/exports
 /nfsboot 192.168.1.0/255.255.255.0(rw,no_root_squash,async)
 /nfsboot 172.17.4.0/255.255.252.0(rw,no_root_squash,async)
- Change the firewall for NFS
-- Uncomment MOUNTD_PORT in /etc/sysconfig/nfs
 # Port rpc.mountd should listen on.
 -#MOUNTD_PORT=892
 +MOUNTD_PORT=892
- Run system-config-firewall
-- add 111 tcp
-- add 111 udp
-- add 892 tcp
-- add 892 udp
--- Port 892 must be entered by hand.
http://d.hatena.ne.jp/setq/20090312/1236853536
- Edit /etc/sysconfig/nfs (for the MOUNTD_PORT setting above)
- Check the mount from 172.17.4.135
 # mkdir /mnt/test
 # mount -v -t nfs 172.17.4.128:/nfsboot /mnt/test
 # cd /mnt/test
 # ls
-- If it does not work, please check the firewall again.
 [root@rsaito-necPC rsaito]# cat /etc/exports
-- Change the firewall.
-- The file system is opened to the 192.168.1.0 network.
-- Do not put a space before the option list "(rw,...)" in /etc/exports.
-- Start nfs
 # service nfs restart
 [root@rsaito-necPC etc]# cd /etc/rc5.d/
 [root@rsaito-necPC rc5.d]# ./S60nfs restart
 Shutting down NFS mountd: [ OK ]
 Shutting down NFS daemon: [ OK ]
 Shutting down NFS quotas: [ OK ]
 Shutting down NFS services: [ OK ]
 Starting NFS services: [ OK ]
 Starting NFS quotas: [ OK ]
 Starting NFS daemon: [ OK ]
 Starting NFS mountd: [ OK ]
 [root@rsaito-necPC rc5.d]# ./S60nfs status
 rpc.mountd (pid 12153) is running...
 nfsd (pid 12150 12149 12148 12147 12146 12145 12144 12143) is running...
 rpc.rquotad (pid 12138) is running...
** PXE server configuration [#nd1b2b49]
- The files and directories are created automatically by
 # system-config-netboot
- GNOME System - Administration - Server Setting - Network Booting Service
-- Push the Diskless button the first time
-- Then the Diskless identifier window starts
 Name fedora_10_32bit
 Explanation fedora_10_32bit
-- NFS information
 server 192.168.1.10
 directory /nfsboot
-- Select the kernel in the 2nd row, which is the newer kernel
-- It automatically generates /tftpboot/linux-install/fedora_10_32bit/ and
--- /tftpboot/linux-install/fedora_10_32bit/initrd.img
--- /tftpboot/linux-install/fedora_10_32bit/vmlinuz
-- A new window appears
 IP address/subnet 255.255.255.0 <- we use the subnet information
 operating system fedora_10_32bit
 Leave the other fields as they are.
-- /nfsboot/snapshot/255.255.255.0/ is generated automatically
-- /nfsboot/snapshot/192.168.1.200/ is generated, too, after specifying the IP address
--- The FFFFFF00 (255.255.255.0) file is generated as above.
 # cd /tftpboot/linux-install/pxelinux.cfg
 # mv default default.org
 # ln -s FFFFFF00 01-00-21-70-c9-eb-60
--- default is renamed so that the PXE client cannot fall back to it.
--- 00-21-70-c9-eb-60 is the MAC address of the PXE client (Dell PC).
--- The MAC address appears on the PXE client screen at boot; the Pause key can be used to stop the display.
--- PXE tries to read the file 01-00-21-70-c9-eb-60 first.
-- An important thing is to put "01-" at the top of the name.
--- If you want to see which files the PXE client tries to get,
--- delete this symbolic link and default; then you will see them.
 [root@rsaito-necPC pxelinux.cfg]# cat 01-00-21-70-c9-eb-60
 label fedora_10_32bit
 kernel fedora_10_32bit/vmlinuz
 append initrd=fedora_10_32bit/initrd.img root=/dev/ram0 init=disklessrc NFSROOT=192.168.1.10:/nfsboot ramdisk_size=24753 ETHERNET=eth0 SNAPSHOT=255.255.255.0
* Set /nfsboot/root/etc [#k1cf5a81]
- /nfsboot/root is used as the root of the PXE client.
-- When /nfsboot/root is changed, the / of the PXE client changes.
- Edit /nfsboot/root/etc/inittab and set the runlevel to 1
-- because the X Window System does not work yet.
-- Later we will change it back to 5.
- Edit /nfsboot/root/etc/sysconfig/network
 [root@rsaito-necPC sysconfig]# cat network
 NETWORKING=yes
 HOSTNAME=pxe-fefoda10-dell
- Edit /nfsboot/root/etc/sysconfig/network-scripts/ifcfg-eth0
- ifcfg-eth1 should be removed (rm ifcfg-eth1) since the client has only one NIC.
- NIC = network interface card.
- The edited ifcfg-eth0 is shown below; the arrows are annotations, not part of the file.
 [root@rsaito-necPC network-scripts]# cat ifcfg-eth0
 # Broadcom Corporation NetXtreme BCM5705_2 Gigabit Ethernet
 TYPE=Ethernet
 DEVICE=eth0
 HWADDR=00:21:70:c9:eb:60   <---- set the PXE client MAC address
 BOOTPROTO=none             <---- change dhcp to none
 NETMASK=255.255.255.0      <---- set the subnet mask
 IPADDR=192.168.1.30        <---- set the fixed IP address
 GATEWAY=192.168.1.10       <---- the PXE server is now the gateway to the 172.17.4.0 network
 ONBOOT=yes
 USERCTL=yes
 PEERDNS=yes
 USERCTL=no
 PEERDNS=yes
 IPV6INIT=no
 NM_CONTROLLED=no
* Set PXE client [#j2d1c71b]
- Start the BIOS setup and select BOOT.
- Select Network boot (no submenu exists).
- Start the PC. If the keyboard has a PAUSE key, the boot messages can be paused
-- until the kernel starts.
-----
* Problems and Solutions [#ica8fb5c]
** P: init is not found. /disklessrc is not found. [#k8582d83]
*** S: use system-config-netboot [#f7dc6178]
*** S: init should be in /tftpboot/linux-install/32bit_fedora_10/initrd.img [#c14aef90]
*** S: the name "32bit_fedora_10" is specified in system-config-netboot. [#cecc0291]
** P: I can find initrd.img but not init itself. [#a22d37aa]
*** S: If you expand initrd.img (see the cpio item below), you will find init. [#wfc84359]
*** S: disklessrc is not generated by system-config-netboot [#w12ee130]
** P: NFS is not mounted [#g7d76017]
*** S: Setting the firewall as above is important. [#h98b4e80]
-- Check that NFS4 is selected as a trusted service.
-- /etc/rc5.d/S60nfs restart
*** S: You can check mounting /nfsboot from another Linux machine. [#j4f67076]
-- From flex:
 mount -t nfs 172.17.4.128:/nfsboot /mnt/nfsboot
-- If you can find the root and snapshot directories, it is correct. [#i3d009ae]
** P: dhcpd does not work [#u9d8ac3d]
*** S: dhcpd is not running. /etc/rc5.d/S65dhcpd restart. [#u9e72095]
*** S: After dhcpd restarts, /etc/rc5.d/S56xinetd restart. [#ceae2ad5]
*** S: Check the DHCP function with a Windows machine. [#jcfbd8cc]
** P: /etc/resolv.conf is not correct. [#mfece086]
*** S: resolv.conf is the one generated on the host computer and copied by rsync. [#yf209065]
** P: Some files cannot be downloaded because of SELinux [#x86222bb]
*** S: Stop SELinux for a moment and check again. [#k7b76729]
*** S: We should change the file type, which can be checked with ls -Z. [#f222962a]
** P: How can we see the contents of the initrd.img generated by system-config-netboot? [#o6a7d0a3]
*** S: Use the cpio command [#t28db375]
-- Extract initrd.img
 # cd /boot
 # mkdir initrd-2.4.9
 # cd initrd-2.4.9
 # zcat ../initrd-2.4.9.img | cpio -i -c
-- Compress initrd.img
 # cd /boot/initrd-2.4.9
 # find . | cpio --quiet -c -o | gzip -c > ../initrd-2.4.9-new.img
** P: X window is not running and the OS can not be used. [#x0e5d526]
*** S: Edit /nfsboot/root/etc/inittab and change the runlevel to 1. [#x3183187]
** P: X window should be adjusted to the new computer [#y68e0196]
*** S: The following seems to work. [#ca6e98a9]
1. Run X -configure, which makes /root/xorg.conf.new
2. cp /root/xorg.conf.new /etc/X11/xorg.conf
3. xorgcfg -textmode
4. startx
5. If it does not work, edit /etc/X11/xorg.conf (change the driver from ati to vesa?)
6. startx again
*** S: If it does not work, try the following, too. [#cc2f6070]
 # Xorg -configure
Then /root/xorg.conf.new is generated. Check that this file works:
 # X -config /root/xorg.conf.new
If X starts properly, press [Ctrl]+[Alt]+[F1] to go back to the console.
 # mv /etc/X11/xorg.conf /etc/X11/xorg.conf.old
 # mv /root/xorg.conf.new /etc/X11/xorg.conf
That is all. Reboot the machine.
http://argon.bus.osaka-cu.ac.jp/index.php?Xorg%20%A4%CE%C0%DF%C4%EA
** P: From the PXE client, we cannot reach the 172.17.4.0 network [#j45dc24d]
*** S: On the PXE server (192.168.1.10), edit /etc/sysctl.conf [#h652065f]
*** S: IP forwarding is necessary. [#te79d994]
 net.ipv4.ip_forward = 1
For the running system:
 # echo 1 > /proc/sys/net/ipv4/ip_forward
To apply the setting in /etc/sysctl.conf:
 # sysctl -p
 net.ipv4.ip_forward = 1
 net.ipv4.conf.default.rp_filter = 1
 net.ipv4.conf.default.accept_source_route = 0
 kernel.sysrq = 0
 kernel.core_uses_pid = 1
Then open System -> Administration -> Firewall,
select Masquerading,
check "All eth+ devices", then apply.
 # service network restart
** P: The SuperMicro X7DCA-L motherboard does not have a PXE boot function? [#fee006de]
*** S: In the BIOS, go to Advanced -> PCI configuration; NIC boot can be enabled there. [#v88bf386]
*** S: Then set the BOOT order by pressing "x", "+" or "-". [#u63b371a]
** P: What kind of NIC supports PXE or IAS? [#bf3f8fbf]
http://www.intel.com/products/desktop/adapters/pro1000gt/pro1000gt-overview.htm
* Links for references [#w46d28ec]
http://takedarts.jp/index.php?%A5%C7%A5%A3%A5%B9%A5%AF%A5%EC%A5%B9%B4%C4%B6%AD%A4%CE%B9%BD%C3%DB
http://www.linux.or.jp/JF/JFdocs/Authentication-Gateway-HOWTO/setup.html
http://tomo.ac/goodstream/fedoracore/fc3/fw-fc3.html
http://lumber-mill.co.jp/gallery/view/tips/linux/fedora
* Directories and files [#m7b27303]
** PXE server [#tb8d1efc]
*** /tftpboot [#de1a36a7]
- /tftpboot/linux-install/ system-config-netboot uses this directory
-- /tftpboot/linux-install/fedora_10_32bit initrd.img and vmlinuz are stored here
-- /tftpboot/linux-install/pxelinux.cfg the initial files for tftp are here
--- /tftpboot/linux-install/pxelinux.cfg/default this should be renamed
--- /tftpboot/linux-install/pxelinux.cfg/FFFFFF00 the 255.255.255.0 subnet is specified
-- /tftpboot/linux-install/pxelinux.0
-- /tftpboot/linux-install/msgs
*** /nfsboot [#d3b22c64]
- /nfsboot/root
- /nfsboot/snapshot
** PXE client [#zd8ff696]
When we searched for "disklessrc fedora", we found the following Web sites.
http://d.hatena.ne.jp/adsaria/20080131/1201792574
http://wikiwiki.jp/disklessfun/?FrontPage
----
* The following statements were eventually not used. [#j85b4ea0]
-The system boot files are put in /tftpboot, and we need to copy the PXE boot image there too.
 su -
 cd /tftpboot
 cp /usr/lib/syslinux/pxelinux.0 .
-Create a minimal /tftpboot/pxelinux.cfg file
 DEFAULT pxeboot
 TIMEOUT 50
 LABEL pxeboot
       KERNEL vmlinuz
       APPEND initrd=initrd.img
 ONERROR LOCALBOOT 0
-Turn on the tftp service:
 # /sbin/chkconfig tftp on
-- The following is another sample dhcpd.conf, which specifies the MAC address
 # dhcpd.conf
 # common settings for all hosts
 use-host-decl-names on;        # the host name and the host name in the config file are the same
 default-lease-time 600;
 max-lease-time 7200;
 #
 # common for a subnet; we can also make a group of hosts by "host"
 subnet 192.168.197.0 netmask 255.255.255.0 {
 #  range 204.254.239.10 204.254.239.20;    # in the case of dynamic IP addresses
    option domain-name "dc2.kek.jp";
    option broadcast-address 192.168.197.255;
    option routers 192.168.197.1;
 }
 #
 # host entries: the following shows how to set a static IP address (bootp type)
 host n011 {                                    # since use-host-decl-names is on, we can use the host name
    hardware ethernet 00:D0:B7:1B:12:ED;        # MAC address
    fixed-address 192.168.197.31;               # static IP address for n011
    option dhcp-class-identifier "PXEClient";   # needed for PXE
    option next-server 192.168.197.11;          # specifies the PXE server (provided by a PXE Proxy DHCP server)
 }
 # the vendor-encapsulated option "next-server" can specify the PXE server
** Edit initrd.img for NFS mount [#y864dbff]
-- We just follow the instructions at
http://www.atmarkit.co.jp/flinux/rensai/linuxtips/a021pxediskless.html
--- However, the mount command does not work correctly
http://www.devdrv.co.jp/linux/cpio-initrd-format.htm
http://blogs.yahoo.co.jp/natto_heaven/11513467.html
 # cp /tftpboot/linux-install/f9-diskless/initrd.img /tmp
 # cd /tmp
 # mkdir initrd-test
 # cd initrd-test
 # cp ../initrd.img .
 # mkdir initrd
 # cd initrd
 # zcat ../initrd.img | cpio -i -c
 # cd sbin
 # cp /sbin/mount.nfs .
 # cp /sbin/umount.nfs .
 # cp /sbin/mount.nfs4 .
 # cp /sbin/umount.nfs4 .
 # cd ..
 # find . | cpio --quiet -c -o | gzip -c > ../initrd-new.img
 # cd /tftpboot/linux-install/f9-diskless/
 # mv initrd.img initrd.img.org
 # cp /tmp/initrd-test/initrd-new.img ./initrd.img
-- /tftpboot/linux-install/f9-diskless/initrd.img is generated by
--- system-config-netboot
-- initrd.img is compressed; it is extracted with the zcat and cpio commands
-- The file system appears in /tmp/initrd-test/initrd/
-- Move to /tmp/initrd-test/initrd/sbin
-- Copy the NFS files (mount.nfs etc.) into this file system
-- Then we compress it to initrd-new.img
-- Keep the original initrd.img as initrd.img.org
-- Copy initrd-new.img to initrd.img
-- Then this initrd.img should contain the NFS files.
-Question: I do not know what mount.nfs etc. are. They seem to be binaries.
-Question: The PXE server is 32 bit. The diskless Linux is a 64-bit OS.
-Question: Is it OK for us to use the 32-bit mount.nfs files in a 64-bit OS?
** PXE server configuration (old) [#v55565b6]
- Make files.custom in /tftpboot/f9/snapshot
- in which we put /home/ for making the /home directory
 [root@rsaito-necPC rc5.d]# cd /tftpboot/f9/snapshot/
 [root@rsaito-necPC snapshot]# ls -l
 total 12
 drwxrwxr-x 8 root root 4096 2009-05-09 08:57 255
 -rw-r--r-- 1 root root 1070 2008-08-26 19:09 files
 -rw-r--r-- 1 root root    7 2009-05-09 08:52 files.custom
 [root@rsaito-necPC snapshot]# cat files.custom
 /home/
** If SELinux complains about something, the chcon command can be used. [#q3d1eaea]
- The chcon command is needed to satisfy the SELinux security policy.
 # cd /tftpboot
 # chcon -R -t type .
- The original type is XXX, which SELinux refuses.
- We are not sure but we use type =
- We can check the file type by
 # ls -Z .
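The file-name rule noted in the DHCP and PXE server configuration sections (the MAC address in lowercase with dashes, prefixed by "01-") can be sketched as a small shell helper. This is only a sketch under assumptions: the function names pxe_mac_name and pxe_hex_name are hypothetical and are not part of syslinux or system-config-netboot. The second helper converts a dotted address into the 8-digit hexadecimal form seen in names such as FFFFFF00.

```shell
#!/bin/sh
# Hypothetical helpers for building pxelinux.cfg file names.
# They are illustrations only, not tools shipped with syslinux.

# Convert a MAC address (colon- or dash-separated, any case) to the
# per-client file name: "01-" followed by the lowercase, dash-separated MAC.
pxe_mac_name() {
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

# Convert a dotted-quad address or mask to 8 uppercase hex digits.
pxe_hex_name() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the argument into its four octets
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

pxe_mac_name 00:21:70:C9:EB:60   # prints 01-00-21-70-c9-eb-60
pxe_hex_name 255.255.255.0       # prints FFFFFF00
```

For the Dell client used on this page, the first call gives the name of the symbolic link created in /tftpboot/linux-install/pxelinux.cfg, and the second gives the name of the file it points to.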